Richard Marks holds a PlayStation virtual reality (VR) headset during his time working for Sony.

In a world where the line between the real and the virtual increasingly blurs, few have shaped technological innovation as directly as Richard Marks. Spearheading game-changing projects at Google and Sony, he has been an instrumental force in changing the way we interact with technology. Now, his move into academia at UNC-Chapel Hill as a professor in the department of computer science and the School of Data Science and Society has generated excitement about what his application-focused approach to virtual and augmented reality and machine learning might produce. In an exclusive interview, Marks reflects on the strategies he developed in the tech industry and offers a glimpse into the future of generative artificial intelligence (AI), as well as what he hopes to bring to computer science (CS) education.

Q: You’ve held industry leadership roles at companies like Google and PlayStation. What prompted this pivot to academia?

A: When I was young, both of my parents were high school teachers, and I always thought that I would like to go into education. I made a conscious decision that I would spend a certain amount of time in school, then in industry, and then come back to school and teach. I kind of set it as 25-25-25 years. It didn’t work exactly that way because I finished my Ph.D. when I was 26, and then I stayed a little longer in industry. But I always had this target, and I thought that having some real-world industry experience would be valuable to come back and share. My thesis adviser at Stanford, Steve Rock, had worked in industry some, and it gave him an interesting perspective that I appreciated. I’m hoping to share that extra perspective with students as well.

Q: How have you found the adaptation to UNC and academia so far?

A: The school has been very welcoming. All the people here have been supportive and very nice. It’s been easy to get started.

Q: You have a unique background spanning aeronautics, gaming, project management and more. What parts of that experience do you see yourself drawing from as a faculty member?

A: I was initially a computer science major as an undergraduate, and in the summer after my first year, I interned at an aerospace company. I was in the software tools group, and there were a bunch of aerospace engineers who were also sitting at computers all day, also programming things, and I thought, “Wait a minute, I don’t have to be a CS major to be a programmer? I can do other engineering things, too?” Some of the tools I was working on as an intern were very dry and not the things I was interested in, so I talked to a lot of the engineers and found that they were working on real-time code to control the airplane and things like that, which sounded much more interesting to me. That was when I became aware that I could still be heavy on CS but make it more applied.

That has kind of carried over through my whole career. I’ve always been focused on the application of technology. My team at Sony developed a lot of new technologies but always with an application for them in mind.

One of the things about aeronautics and astronautics in particular is that it takes a systems engineering approach. There are a lot of different complex systems in an aircraft or spacecraft, and understanding a little bit about all of those things and making them work together is really a systems engineering problem. Video games are also very much a systems engineering problem. There are a lot of different components: animation, AI, graphics, interface. So I guess that’s what I’ve brought to my whole career: being a generalist and trying to take a little bit of knowledge about a lot of things and piece it together in some cohesive way. I do think it’s important to go deep on some things, so computer vision was the focus of my thesis and of a lot of my career. I also looked for that in hiring: candidates who had one area of depth and a wide range of breadth.

Q: How do you believe your industry experience will shape the educational experience for your students at UNC-Chapel Hill?

A: One of the things that led me to follow the path I did was that when I asked a professor or a high school teacher, “When will this be used?” sometimes there would be no good answer. I want to always have a use case that I can relate classwork back to. In the class I’m currently teaching, I have a lot of videos from my time at Sony demonstrating the techniques the students are learning, so the students have very concrete examples of how this could matter or has mattered in the past.

I also get a lot of students asking for career advice. I worked at a startup, at a big company and as a consultant, so I’ve been in a lot of the different kinds of job situations you could be in. So when they’re looking for jobs, I can give some advice and get a feel for their interests and the types of roles they might like best. I’m not a career counselor, but I try to be helpful.

Q: You have a position in the School of Data Science and Society (SDSS) and a joint position in the department of computer science. How do you see those disciplines interacting with each other, and how does your work apply to both? Are there specific projects or collaborations you’re eager to embark on?

A: I think a lot of the need and the vision for SDSS was understood by CS. It was a focus area that needed more attention from the university. In my work at Sony, I was much more focused on things that were not data science, but when I went to Google, my team there did a lot of machine learning and deep learning. A lot of my career would be considered computer science, but the most recent portion was data science, so it feels natural to me to be in both worlds. Right now, I’m really interested in large language models and generative AI. It seems like that’s what everyone in the world is interested in right now. But that power combines well with computer graphics and virtual reality.

I haven’t really started my research here at UNC yet, but I’ve spoken with a lot of professors in designing and teaching my course, and I’m excited to get to work at the intersection of graphics, virtual reality (VR), augmented reality (AR) and generative AI. There are other professors working on various aspects of these things, and I am already talking to them about projects to collaborate on.

Q: You’re teaching Introduction to VR and 3D Graphics this semester. What interesting things are you doing in that course?

A: It’s a bit of an accelerated time scale because I think 3D graphics could be a class unto itself, but I really wanted to teach more of the VR side, so I put 3D graphics on the front end so that the students would understand a lot of the concepts you need in order to do VR. We’ve been through the fundamentals of graphics, and we’re now applying those fundamentals using the game engine Unity to create things and interact with graphical worlds. The next step coming up is to give everyone a VR headset so that the last homework will be writing code for the VR headset. The rest of the class is the final project where students will work in pairs to make a VR experience of their own design.
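
As a rough illustration of one of those graphics fundamentals (the course itself works in Unity, and this sketch is not course material), here is a minimal Python example of perspective projection, the pinhole-camera step that maps 3D points onto a 2D image plane; the function name and the cube data are purely illustrative.

```python
import numpy as np

def perspective_project(points, focal_length=1.0):
    """Project 3D points (N x 3, camera coordinates, z > 0) onto a 2D
    image plane with a pinhole-camera model: x' = f*x/z, y' = f*y/z."""
    points = np.asarray(points, dtype=float)
    z = points[:, 2]
    return np.column_stack((focal_length * points[:, 0] / z,
                            focal_length * points[:, 1] / z))

# Example: the eight corners of a cube sitting one to two units in front
# of the camera. Nearer corners project farther from the image center,
# which is the foreshortening effect 3D rendering relies on.
cube = np.array([[x, y, z] for x in (-0.5, 0.5)
                           for y in (-0.5, 0.5)
                           for z in (1.0, 2.0)])
print(perspective_project(cube))
```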

This kind of goes back to the previous question, but at some point in the semester, I want to teach one lecture in my class about generative AI and some of the ways that it could be used. One example is that when you go through a virtual world, the content of that world is currently mostly hand-authored by artists. It’s very expensive and time-consuming to author all of that content. Generative AI has proven to be pretty good at generating images, for example with Midjourney or DALL-E, and people are working on similar models to generate 3D content. I could walk into a new virtual room, and the content of the room could be automatically generated for me. Another example is that concept artists are using AI to help generate concept art that game artists can use for inspiration. It’s still the artists authoring content, but having AI involved can be useful.

Q: What’s your philosophy on innovation, especially in such rapidly evolving fields?

A: At PlayStation, we called ourselves “experience engineers.” Basically, when you want to explore a technology, you pick an experience and quickly engineer a prototype to understand as much as you can about it, but not necessarily to ship that thing or turn it into a product. You’re mostly trying to understand what the technology offers, what’s missing from it and what parts matter to the experience. Innovation is always a loop between technology and the application of the technology. It’s really hard to separate them and say you’re only a technology person or only an application person. Not everyone can encompass both completely, but at the very least there should be a communication path between the people on the product side and the research side, so you can hand something over for them to try and have them tell you when it’s not what they need.

And this is not new information; many people talk about “failing fast.” I don’t know that “failing” is the right word; I think “exploring” is the right way to think about it. I encourage all of my students not to just read about something but to try to use what they’ve learned somehow to get a better understanding. For example, when I was learning trigonometry, I found it really dry and unexciting. But I started plotting the trigonometric functions on a computer, and it just made me care more about it. It gave me something concrete to try to make rather than memorizing abstract trigonometric identities.
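
In that spirit of exploring rather than just reading, here is a minimal Python sketch of the kind of experiment Marks describes (illustrative only, not his original code): plotting sine and cosine, along with the identity sin^2(x) + cos^2(x) = 1, makes the abstract identities visible.

```python
import numpy as np
import matplotlib.pyplot as plt

# Plot sine, cosine and the Pythagorean identity over two periods,
# turning abstract identities into something you can see.
x = np.linspace(0, 4 * np.pi, 500)
plt.plot(x, np.sin(x), label="sin(x)")
plt.plot(x, np.cos(x), label="cos(x)")
plt.plot(x, np.sin(x)**2 + np.cos(x)**2, label="sin^2(x) + cos^2(x)")
plt.xlabel("x (radians)")
plt.legend()
plt.title("Exploring trigonometric functions by plotting them")
plt.show()
```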

Q: After seeing a lot of advanced research projects firsthand at Google ATAP, do you have any guesses as to what will be the “next big thing” in CS or tech?

A: What we’re finding now with generative AI and large language models is that they’re pretty powerful, and a lot of the problem is asking for what you want. The interface to ask is currently typing a lot of text, and sometimes it doesn’t give you exactly what you want, so you have to add even more text. So over time, there’s an improvement as we learn how to give prompts that get better results. That’s the way we used to be with handwriting recognition. The first versions wouldn’t recognize everything, and you needed to learn how it recognized your writing and change your writing to help the computer. Eventually, we got better at this, and it learned to better recognize our writing, so we didn’t have to adapt to it. So the goal is always to engineer so that the computer adapts to us, rather than the other way around.

When I was at Sony, I was really interested in smart speakers and even bought one for my father. He told me soon after that he wanted to return it because it was broken. I asked what he meant, and he said, “Well, I asked it ‘When is golf?’ and it didn’t know.” That’s a very ambiguous question. If he’s standing in front of the television on a day that the Masters is being played, then he wants to know what time the tournament coverage starts. But if he’s wearing his golf shoes and carrying a bag on his back, he’s wondering when his tee time is. Those are very different things from the same question, but with enough context, the smart speaker could probably get it right. Getting that context will be the next trend to make these devices work better for us.

By Brett Piper, the Department of Computer Science
