The Englander Institute for Precision Medicine’s Director Olivier Elemento, Ph.D. was recently interviewed by LDV Capital in advance of their 9th Annual LDV Vision Summit – “the premier global gathering in visual tech” – hosted virtually on March 28, 2023.
Which sectors over the next 10 years will have the most societal impact from research advancements leveraging visual technologies and artificial intelligence? In the lead-up to our 9th Annual LDV Vision Summit – the premier global gathering in visual tech – hosted virtually on March 28, we addressed this and 4 other burning questions to 10 brilliant professors & researchers.
Which sector over the next 10 years do you believe will have the most societal impact from research advancements leveraging visual technologies and artificial intelligence?
Dr. Lydia Chilton, an Assistant Professor of Computer Science at Columbia University: Generative image technologies like Dall-E, Midjourney and Stable Diffusion are already having an impressive impact. In the next ten years these technologies will get much better, and they’ll also be able to produce video. We’ll be able to help low-literacy patients understand and remember their health care instructions, create exciting and personalized education for children, and make scientific diagrams and demonstrations quickly and easily.
Dr. Olivier Elemento, Director of Englander Institute for Precision Medicine, Associate Director of Institute for Computational Biomedicine, Associate Program Director of Clinical & Translational Science Center: Mixed reality – superimposing virtual objects and information into the real world. We have several projects involving Microsoft Hololens technology at our Institute, for example for visualizing complex biological networks or training our staff to perform complex experiments. I have been blown away by the potential of mixed reality to dramatically change how we visualize information (why look at a screen if you can have all the information delivered to your eyes directly), interact with and enhance the real world and generally augment our cognitive abilities. I anticipate that mixed reality will increasingly leverage AI capable of understanding our environment.
Dr. Mackenzie Mathis, the Bertarelli Foundation Chair of Integrative Neuroscience and an Assistant Professor at the Swiss Federal Institute of Technology, Lausanne (EPFL): I hope clinical medicine will start to embrace visual technologies for patient care, and I think that’s possible on a 10 year horizon. It could have huge therapeutic implications to be able to use non-invasive video data to help guide smart rehabilitation, even at home. Right now it is not the standard of care for patients suffering from motor disorders, such as stroke or Parkinson’s, to have visual tech aids, but it’s already becoming possible in research labs. I’d love to see a world where that technology is as easy to use as putting an Alexa on your kitchen counter.
Dr. Fernando De La Torre, a Research Associate Professor at Carnegie Mellon University: In the coming decade, various industries will experience significant societal impact from mature visual technologies, such as autonomous driving, healthcare, retail, agriculture, or manufacturing. Medical research is the most likely to have far-reaching consequences for society. The combination of visual technologies and artificial intelligence has the potential to greatly enhance medical diagnosis through the automatic analysis of imaging techniques like x-rays, CT scans, or MRIs. In addition, extended reality (AR, VR, MR) has transformative potential in training surgeons and educating other medical professionals, as well as in the treatment of some medical conditions (e.g., PTSD, anxiety). Although many of these technologies have already had some impact, I anticipate significant advancements over the next ten years.
Personally, I find the possibilities of 2D/3D digital humans to be quite exciting. These 2D/3D human models are designed to look, sound, and behave like real humans and can transform the way we communicate, work, learn and connect with others adding a better sense of presence. In addition, this technology can be used for improving education and training, improved accessibility (e.g., virtual companionship), or high-quality personalized content creation in entertainment (e.g., gaming, movies).
Dr. Matt Uyttendaele, Director at Meta AI (Fundamental AI Research team): Climate change may make large areas of the planet uninhabitable in the coming 50 years, so anything we can do to address that will have huge societal impact. There are many opportunities to leverage computer vision and AI to combat climate change. Measuring carbon in the earth’s biomass, detecting methane leaks, and picking optimal sites for solar installations are just a few examples of important climate-related problems that can be solved through AI trained to analyze the planet. My recent focus has been to transfer some of the deep neural net (DNN) technology from core computer science (vision, speech, language) to material science. It turns out that many of the DNN architectures and other data-driven techniques transfer over well and could be used to discover new materials that function to better generate clean hydrogen or capture CO2 from the atmosphere.
Dr. Karen Panetta, Dean of Graduate Education for the School of Engineering at Tufts University: Ocean observing systems for conservation, exploration and sustainability are new arenas for visual technologies to make transformative impacts. There are so many applications that can benefit from cost-effective visual technologies including aquaculture for harvesting new food sources, protecting aquatic animals and plant life when installing new pipelines and offshore platforms or proactive early detection of corrosion that could lead to environmental disasters.
Dr. Serge Belongie, a Professor of Computer Science at the University of Copenhagen, where he also serves as the head of the Danish Pioneer Centre for Artificial Intelligence: Augmented Reality. In view of some high profile product failures by big companies, I know it’s almost a joke to say “AR is 5 years away,” but I believe it. I don’t know which company will make the necessary breakthroughs, but I’d put my money on Apple. Once it arrives, I anticipate that AR – whether or not it’s called that – will become the primary medium through which we will experience AI in our day-to-day lives. Will it be in the form of glasses? Or something projector-based in your home or car? Perhaps it will be all of these.
Dr. Ramesh Raskar, an Associate Professor at MIT Media Lab: Robotic automation, especially home robotics. Human-emotion-driven movies and media. Eliminating disabilities like blindness and low-vision challenges.
Dr. Nikhil Naik, Director of AI Research at Salesforce: Creative arts (movies, design), biotech (drug discovery).
What visual tech-powered product do you hope exists in 20 years?
Dr. Lydia Chilton: AI-generated video is the dream. It would be ideal if it could also augment existing videos. For example, I’d love to show kids how electricity flows, or how a toilet flushes by augmenting videos they take in their own home to illustrate the science. Having visuals situated in people’s existing environment helps connect new concepts with their reality. It might even make IKEA instructions clear.
Dr. Olivier Elemento: Mixed reality devices like the Hololens are too bulky to become broadly used. I’d love to see smart contact lenses or perhaps more likely connected eyeglasses capable of mixed reality.
Dr. Mackenzie Mathis: I confess I think augmented reality will have a big impact, and I imagine walking around a lovely English garden and instead of pulling out a phone to take a picture to classify a rose, I seamlessly could read text next to the plant, or listen to a transcription in my modern AirPods. Also, this could be very useful for everyday tasks like shopping in the grocery store – we can all stop touching avocados and just get an instant prediction of how ripe it is!
Dr. Fernando De La Torre: I work in the Robotics Institute at Carnegie Mellon University, where I have witnessed the remarkable progress of robotics over the last two decades. While we have already seen the emergence of robotics devices such as vacuum cleaners, lawn mowers, toys, and window cleaners, these robots are currently limited in their capabilities and can be expensive. However, it is conceivable that in the next 20 years, we may have robots that are powered by AI and vision and can perform a wider range of household tasks. Imagine having a robot that can handle all your housework, from cooking and cleaning to laundry. Despite the many challenges that still need to be addressed, such as enabling robots to operate in complex and challenging environments, handle objects of varying shapes and sizes, and perform tasks with precision and accuracy, the potential benefits are immense. With robots taking over household chores, we might finally be able to put an end to the age-old arguments over whose turn it is to load the dishwasher!
Dr. Matt Uyttendaele: A much more precise digital twin of Earth. I believe to solve this would require visual tech to be integrated with other sensing modalities and AI. The result would be a better ability to predict short- and long-term climate trends and a deeper understanding of how humans impact the planet.
Dr. Karen Panetta: Low-cost sensors that can help humans visualize more information from multiple imaging modalities and not separate sensors that only look at RGB or thermal images separately.
Dr. Serge Belongie: Something akin to a dæmon from His Dark Materials or a cookie from the “White Christmas” episode of Black Mirror, minus the dystopia thing.
Dr. Gaile Gordon: One of the most pressing crises we face as a society is affordable housing. I would love to see our collective creativity leveraged to improve the quality, efficiency, and cost of prebuilt housing. There should be great opportunities to leverage visual technology and AI to speed the work of architects in adapting designs to local site needs and aesthetic goals, to automate quality offsite manufacturing, and perhaps even automate delivery.
Dr. Ramesh Raskar: Co-pilot in the real world with AR glasses.
Dr. Nikhil Naik: I believe that AI technology is accelerating so quickly that it is impossible to predict what may be possible in 20 years! I hope that a visual tech-powered product that will enable any person to generate and consume production quality immersive content will be available in the next 5 years.
Do you have any advice for researchers or professors who are interested in commercializing their research? What can inspire them to finally take that first step out of academia and turn their research into a valuable business?
Dr. Lydia Chilton: Talk to people. As a researcher, it’s so easy to get caught up in just making the tech work. But to develop the business use case, you have to see how your ideas and your technology would actually impact people’s lives. Not everyone will have a use case for you, but somebody will. It’s out there, but you won’t find it by reading. You have to engage with people.
Dr. Olivier Elemento: My advice if you want to commercialize your research: start a company. This is an incredibly rewarding experience, much more rewarding and fun than simply licensing your research IP to another company. If you decide to go that route, my number one advice is to build a team – you can’t do it alone, especially if you want to keep your day job in academia. There are tons of great ideas in academia, and most never leave the lab, because many scientists are too busy and don’t have the skills and experience required to start a company. The best way to address this is to join forces with individuals who have time and needed experience.
Dr. Mackenzie Mathis: Honestly, the people around you and the product. One, you have to have a passion for the eventual product, and two you need the right team who you trust and want to grow with you. I think it’s not unlike starting ambitious collaborative research projects in this way.
Dr. Fernando De La Torre: With the knowledge I have now, there are a few things I would approach differently. First, for those who are commercializing technology for the first time, it may be worthwhile to consider hiring business professionals to assist in navigating the process. As a professor, your time is best spent on technical work, so finding someone trustworthy with prior experience in the commercialization process can be invaluable. Second, conducting market research to identify the gaps in the market and knowing your competitors very well are crucial. Third, hiring good legal representation may be expensive, but it is essential. Be upfront about your budget. Fourth, building a team was the most challenging part of the process for me. While you may know which students are the most talented within your group, additional expertise in business or marketing may be required; hire people through references from people you know and trust. Fifth, find a venture capital firm that can help you execute and provide contacts besides cash. Finally, licensing your technology could be an excellent option for bringing your product to market without the risk and expenses of starting your own company.
For many academics, the thrill of seeing their research make a real-world impact is a main motivation to drive them to start a company. Take my mother, for example, who lives in Spain. She never quite understood why I spent so much time doing mathematical modeling of images, but when she was able to use the technology that we developed for facial-swapping and background subtraction in Portal (video chat device) to connect with her grandchildren, and explain stories, she became a believer! This was a very rewarding moment for me.
Starting a business is also an exciting and rewarding challenge. Academics who thrive on intellectual stimulation can find a new way to apply their skills, and as a business owner, you get to make all the decisions and set the direction for the company’s work. This is different from working in big corporations, where your capacity for making big decisions is typically limited. Plus, there is the potential for financial gain, which can provide more freedom in the future to do what you love. Whatever your motivation may be, it’s important to stay passionate about your research and believe in its potential to make a positive impact. And of course, don’t forget to have some fun along the way – after all, a happy researcher makes for a successful business!
Dr. Matt Uyttendaele: If you are applying AI to a problem where your field has yet to explore the benefits of AI as a tool, then there is no better time than now to jump in with both feet. In my career in computer science, I saw many years of incremental progress in speech understanding, computer vision, and natural language processing – then deep nets came along and we’re now witnessing a major inflection point in progress. That same thing will happen in many other domains – proteins, weather, medicine, and chemistry, to name a few.
Dr. Karen Panetta: Get out of the lab and into the arena! Customer development and understanding what customers’ pain points are is essential to developing a compelling case for your products and services. The discovery process is essential. Speak to key stakeholders and even to individuals not skilled in your topic area. Gathering such diverse perspectives helps shape direction and improve communication of the impact of your company’s mission and value.
Dr. Serge Belongie: AI is in a state of upheaval, which is when the most interesting technologies emerge. Think of when Nokia lost its dominant position, around 2008. That was like a huge tree falling in the forest, tearing up the soil and opening up the canopy for countless innovative technologies to emerge in smartphone hardware and software. Several big companies that were nimble startups a decade or two ago are showing signs of meeting the same fate as Nokia. If you’re watching what’s happening in Generative AI, for example, and thinking “I’ve got a killer idea to drive this home for real customers in a sector I know like the back of my hand,” then go for it. VCs will be ready to fund you, and veteran engineers at the big tech companies will be ready to join your team.
Dr. Gaile Gordon: It is sometimes challenging for researchers or academics to take a step back and see their brilliant work from the eyes of a customer. Take the time to make connections to the end user and understand their perspective. This should be a very inspirational journey, and if it’s not – the direction may need to be reoriented!
Dr. Ramesh Raskar: Always ask ‘What would I do if money and time were not factors?’ for research and then once a year ask ‘What part of my portfolio can have a gigascale (impacting a billion people/a billion dollars) impact in the next five years? Can I spin that off?’
What is the best and the worst personality trait of an investor?
Dr. Olivier Elemento: Best: humility. Worst: follower.
Dr. Mackenzie Mathis: Best: straight-shooter. Worst: deceptive.
Dr. Fernando De La Torre: Best: strong analytical abilities to weigh risk and reward. Worst: impatience and greediness.
Dr. Matt Uyttendaele: Best: curious. Worst: herd-mentality.
Dr. Karen Panetta: Best & Worst: decisive because once they make a decision it’s hard to convince them otherwise. However, it’s good too because once they believe in you, they tend to keep supporting/encouraging you.
Dr. Serge Belongie: Best: grace. Worst: arrogance.
Dr. Gaile Gordon: Best: creativity. Worst: arrogance.
Dr. Ramesh Raskar: ‘Moses-trap’ as defined by Safi Bahcall in Loonshots.
Dr. Nikhil Naik: Best: deep thinker. Worst: trend-chaser.
What are you most looking forward to at our 9th annual LDV Vision Summit, on March 28th?
Dr. Olivier Elemento: I look forward to hearing about cool technologies and new ideas in the field of vision. It’s such an exciting field, especially with the ultra-rapid developments in the application of AI to vision, from automated image analysis to the generation of entirely new images and videos based on text and speech.
Dr. Mackenzie Mathis: Learning about all the cool advances since last year and networking with like-minded people! Can’t wait!
Dr. Fernando De La Torre: I am excited about attending all the talks and witnessing the progress in commercializing various computer vision technologies.
Dr. Matt Uyttendaele: The serendipity. I love the summit for the broad range of backgrounds and talents that are brought together. As a scientist, I can look at the schedule and pick a few of the deep tech talks that I know I’ll enjoy. But, I also know that there will be a VC or policy talk that will give me a new perspective that I wasn’t at all expecting.
Dr. Serge Belongie: Catching up with old friends and learning about the latest visual tech innovations.
Dr. Gaile Gordon: The Summit always combines great content from researchers, entrepreneurs, and investors – I look forward to being inspired!
Dr. Ramesh Raskar: Magic and moments of ‘How did they do that?’
Dr. Nikhil Naik: Learning from the fantastic set of speakers on how visual tech is changing the world!
Learn more about these brilliant researchers & professors
Dr. Lydia Chilton is an Assistant Professor of Computer Science at Columbia University. Her focus of interest is human-computer interaction. Her research views the design process from a computational standpoint. She is a part of the Computational Design Lab, a research group in the Computer Science Department of Columbia University, where she builds AI tools that enhance people’s productivity. Read our interview about AI-generated visual content for social media in the Women Leading Visual Tech series.
Two current projects are constructing visual metaphors for creative ads and using computational tools to write humor and news satire.
Chilton received her Ph.D. from the University of Washington in 2015. She received her Master’s in Engineering from MIT in 2009 and her SB in 2007, also from MIT. Prior to joining Columbia Engineering in 2017, she was a postdoctoral student at Stanford University.
Dr. Olivier Elemento is the Director of the Englander Institute for Precision Medicine (EIPM), Professor in the Department of Physiology and Biophysics, Associate Director of the Institute for Computational Biomedicine, and Associate Program Director of the Clinical & Translational Science Center.
Dr. Elemento led the development of the first New York State approved whole exome sequencing test for oncology and also developed new methods for assessing tumor-driving pathways, the immune landscape of tumors and predicting immunotherapy responders. Moreover, Dr. Elemento developed methodologies to repurpose existing drugs to target specific pathways, predict drug toxicity and identify synergistic drug combinations, combining Big Data with experimentation and genomic profiling to accelerate the discovery of cancer cures.
He has published over 150 scientific papers in the area of genomics, epigenomics, computational biology and drug discovery.
Dr. Mackenzie Mathis is a neuroscientist, a tenure-track professor at the Swiss Federal Institute of Technology, working within the Brain Mind Institute & Center for Neuroprosthetics. The lab is hosted at the Campus Biotech in Geneva, Switzerland, where she holds the Bertarelli Foundation Chair of Integrative Neuroscience.
Dr. Mathis founded the Adaptive Motor Control Lab to investigate the neural basis of adaptive motor behaviors in mice to inform future translational research in neurological diseases. Her team’s goal is to reverse engineer the neural circuits that drive adaptive motor behavior by studying artificial and natural intelligence. The researchers are using the latest techniques in 2-photon and deep brain imaging. They also develop computer vision tools, like DeepLabCut.
Before that, Mackenzie completed her doctoral studies and was a faculty member at Harvard University. Her work has been featured in Nature, Bloomberg BusinessWeek, and The Atlantic.
Read our interview about the interesting interplay between biological intelligence & artificial intelligence in the Women Leading Visual Tech series.
Dr. Fernando De La Torre is a Research Associate Professor at Carnegie Mellon University and the director of the human sensing laboratory. He develops machine learning algorithms for modeling and describing human behavior using sensors (audio, video, etc.), with applications in medical diagnostics (e.g., depression, Parkinson’s disease, hot flashes) and augmented reality/virtual reality. Fernando works at the intersection of computer vision and machine learning, with an emphasis on building systems. Most of his team’s systems are learning-based, and they develop foundational algorithms in the areas of generative AI, responsible AI, and data-focused computer vision (focus on the data, not the algorithm).
In 2014, Fernando founded FacioMetrics LLC, a company that specializes in face image analysis technologies. FacioMetrics LLC was acquired by Facebook/Meta in 2016. He spent four years at Facebook/Meta, where he led the efforts in mobile augmented reality. His team developed the technology for face tracking, background removal, hair coloring, hand tracking/detection, and more. Fernando also spent two years in Facebook Reality Labs working on the areas of digital humans for virtual reality, and contributed to the core algorithm behind face tracking in Meta Quest Pro.
Dr. Matt Uyttendaele is the director of the Core AI Group within Facebook AR/VR. His group does a mix of research and product development focusing on computer vision. Areas of work for the group include: vision on mobile devices, visual perception, deep neural nets, imaging algorithms, and computational photography.
Prior to Facebook, Matt worked at Microsoft Research and Bell Labs. At Microsoft, his research resulted in several apps including Hyperlapse, Photosynth, Blink, and ICE (Image Composite Editor) and product features in Windows, Bing, and Office. At Bell Labs, Matt focused on video processing, developing compression algorithms (MPEG, H.263) and VLSI implementations.
His research has been presented at SIGGRAPH, CVPR, and the IEEE International Solid-State Circuits Conference. Matt holds 62 US patents. He received an MS in electrical engineering from Rensselaer Polytechnic Institute.
At our 2018 LDV Vision Summit, Matt spoke about enabling persistent Augmented Reality experiences across the spectrum of mobile devices.
Dr. Karen Panetta is an electrical and computer engineer, inventor and Dean of Graduate Education for the School of Engineering at Tufts University. Her research areas include artificial intelligence, machine learning, automated systems, simulation and visual sensing systems. Karen develops signal and image processing algorithms, simulation tools and embedded systems for robot vision and biomedical imaging applications. Check out our interview on the magic of engineering and computer science.
She has won a number of awards for excellence in research, social impact, teaching and mentoring, ethics, and engineering education. She is the recipient of the Presidential Award for Excellence in Science, Math and Engineering Mentoring from U.S. President Barack Obama.
Dr. Panetta founded the “Nerd Girls” program, which encourages young women to pursue engineering and science. Karen is the editor-in-chief of IEEE WIE Magazine and co-author of the book, “Count Girls In”.
Dr. Serge Belongie is a Professor of Computer Science at the University of Copenhagen, where he also serves as the head of the Pioneer Centre for Artificial Intelligence. Previously, he was a professor of Computer Science at Cornell University, an Associate Dean at Cornell Tech, and a member of the Visiting Faculty program at Google.
His research interests include Computer Vision, Machine Learning, Augmented Reality, and Human-in-the-Loop Computing. He is also a co-founder of several companies including Digital Persona and Anchovi Labs.
He is a recipient of the NSF CAREER Award, the Alfred P. Sloan Research Fellowship, the MIT Technology Review “Innovators Under 35” Award, and the Helmholtz Prize for fundamental contributions in Computer Vision. He is a member of the Royal Danish Academy of Sciences and Letters. Serge is also an Expert in Residence with us at LDV Capital.
Dr. Gaile Gordon has over 20 years of experience in computer vision-related products and R&D leadership. She has a proven track record transitioning technology from R&D to production in both enterprise and consumer markets. Dr. Gordon is the co-founder of TYZX, which produced hardware for accelerated 3D cameras and computer vision applications for a variety of markets including security, robotics, and automotive industries.
TYZX was acquired by Intel in 2012 and fueled their RealSense 3D technology products. Her product experience spans custom ASICs, firmware, software APIs, to end-user applications.
Gaile received a Ph.D. from Harvard, an MS and a BS from the MIT AI Lab. Gaile advises early-stage companies, is an Expert in Residence with us at LDV Capital, and is an active angel investor. Check out our interview “No perfect time to start a company”.
Dr. Ramesh Raskar is an Associate Professor at MIT Media Lab and directs the Camera Culture research group. His focus is on AI and Imaging for health and sustainability. These interfaces span research in physical (e.g., sensors, health-tech), digital (e.g., automating machine learning) and global (e.g., geomaps, autonomous mobility) domains.
He received the Lemelson Award (2016), ACM SIGGRAPH Achievement Award (2017), DARPA Young Faculty Award (2009), Alfred P. Sloan Research Fellowship (2009), TR100 Award from MIT Technology Review (2004) and Global Indus Technovator Award (2003).
Ramesh has worked on special research projects at Google [X], Apple and Facebook and co-founded/advised several companies. He took part in the panel discussion on computational photography at our 2018 LDV Vision Summit.
Dr. Nikhil Naik is Director of AI Research at Salesforce. He leads a team of researchers and engineers working on generative AI and its applications in natural language processing, computer vision, and biomedicine. Nikhil’s team has developed state-of-the-art AI models for language and image generation and deployed them in collaboration with large companies and university labs at Stanford, UCSF, and UC Berkeley, among others.
Nikhil has recently co-authored a paper on using large language models like ChatGPT for protein design. ProGen – the language model described in the paper – can generate protein sequences with a predictable function across large protein families, akin to generating grammatically and semantically correct natural language sentences on diverse topics.
At our 9th Annual LDV Vision Summit, Dr. Naik will speak about using large language models like ChatGPT for protein design. Tune in at 12:45PM ET to listen to his session – RSVP to attend!
# # #
The above article originally appeared on the LDV website on March 20, 2023.