Meet The Researcher: Alina Zare, Advancing Machine Learning and Sensing

‘Meet the Researcher’ is a series in which we feature different researchers in academia who are using GPUs to accelerate their work. This month we spotlight Alina Zare who conducts research and teaches in the areas of machine learning, artificial intelligence, computer vision and image analysis, and remote sensing.

Zare directs the Machine Learning and Sensing Lab at the University of Florida. She is a recipient of the 2014 National Science Foundation CAREER award for her work in Learning from Imprecise and Uncertain data.

What are your research areas of focus?

I develop machine learning and artificial intelligence methods to automate processing and understanding of remote sensing and (primarily non-visual) sensor data. Applications I have studied include landmine and buried explosive hazard detection, automated above- and below-ground plant phenotyping, target detection, agriculture, and underwater scene understanding. I have developed algorithms to automate analysis of data from ground penetrating radar, lidar, hyperspectral cameras, wideband electromagnetic induction, thermal cameras, synthetic aperture sonar, and minirhizotron imagery, among others.

What motivated you to pursue this research area of focus?

I get excited about developing systems that are used in practice and have an impact. So, I have focused my research career on applied AI and on developing AI approaches for practical applications with a clear transition to practice. I focused on remote sensing applications because remote sensing is useful in many pressing areas of global and societal concern, such as agriculture and ecology. Also, I think the sensor technology available in remote sensing is just so cool. Using sensor systems like hyperspectral cameras, lidar and ground penetrating radar gives us the opportunity to make super-human AI systems that have capabilities beyond what we as humans can naturally accomplish with our vision systems.

Tell us about your current research projects.

I have several on-going projects I am really excited about.

For example, we are developing approaches to better study plant roots. As you can imagine, it is really hard to collect data about roots in realistic field conditions since they are underground! The plant science community has employed a lot of creative solutions to study roots including growing plants in clear gels in the laboratory, growing plants between narrow plates of clear plastic, carefully excavating root systems, or even using “shovelomics” in which plants are pulled out of the ground to see and measure the roots that get pulled up.

As you can also imagine, these methods have major drawbacks. Either they are not carried out in realistic field conditions, so the data collected do not indicate how the root system would behave in the field, or they are destructive and do not allow for studying root systems as they grow and change over time.

To help tackle this challenge, we have been developing approaches to improve minirhizotron image analysis as well as to deploy backscatter X-ray systems or X-ray CT to image roots. In particular, minirhizotron tubes are clear plastic tubes that are inserted into the soil among the plant’s root system. Then, you can send a camera down the tube to take imagery of the roots that happen to grow alongside the tube. Minirhizotron tubes have a lot of advantages, including the fact that you can image the same root system repeatedly over time. However, the standard approach to analyzing minirhizotron images is to hand-trace each and every root, which is very tedious and time consuming.

We (and others) developed deep learning techniques to automate the processing of these images. However, to train deep learning architectures, thousands (or even tens of thousands) of hand-traced training images are needed. Even simply getting the training data is slow, tedious, and error-prone. So, we have developed machine learning approaches that learn to detect and segment the roots in the imagery from “weak” labels (on the image scale) on a sample training data set. In other words, users only need to mark example images as “containing roots” or “does not contain roots.” Then, our approach learns from these weak labels and the corresponding example training set how to detect and segment roots in the collected data. The trained method can then be applied to a large image set.

More information about this can be found in these papers: https://faculty.eng.ufl.edu/machine-learning/2020/08/weakly-supervised-minirhizotron-image-segmentation-with-mil-cam/ https://faculty.eng.ufl.edu/machine-learning/2019/03/4260/
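To make the idea of learning pixel-level output from image-level labels concrete, here is a deliberately tiny sketch in plain Python. It is not the MIL-CAM method from the papers above. It swaps in a much simpler multiple-instance baseline and assumes that roots show up as bright pixels and that an image marked “does not contain roots” has no root pixels at all. The function names (learn_threshold, segment) and the toy 2x2 images are hypothetical.

```python
def learn_threshold(images, labels):
    """Learn a brightness cut-off from image-level labels only.

    Assumption: an image labelled False ("does not contain roots") has no
    root pixels, so every pixel value seen there is background. The largest
    such value is used as the cut-off.
    """
    background = [
        pixel
        for image, has_roots in zip(images, labels)
        if not has_roots
        for row in image
        for pixel in row
    ]
    return max(background)


def segment(image, threshold):
    """Mark a pixel as root (1) if it is brighter than anything seen in background."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


# Toy data: three 2x2 "images" with image-level labels only (no tracing).
images = [
    [[0.1, 0.2], [0.9, 0.1]],  # contains roots
    [[0.2, 0.3], [0.1, 0.2]],  # does not contain roots
    [[0.3, 0.8], [0.7, 0.2]],  # contains roots
]
labels = [True, False, True]

threshold = learn_threshold(images, labels)
masks = [segment(image, threshold) for image in images]

print(threshold)  # 0.3
print(masks[0])   # [[0, 0], [1, 0]]
print(masks[2])   # [[0, 1], [1, 0]]
```

No pixel was ever traced by hand, yet the output is a pixel-level mask. Real minirhizotron imagery is far messier than a single brightness cut-off can handle, which is why learned deep features are used in practice.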

My minirhizotron root phenotyping work is in collaboration with wonderful partners at the University of Florida (UF), University of Missouri, University of Texas – Austin, and Argonne National Lab and has been funded by ARPA-E, DOE and USDA.

Another research effort I am excited about is in the area of forest ecology. Detailed data on individual trees can help mitigate climate change, track biodiversity, manage invasive species, and understand the impact of land use changes on natural systems. However, mapping each individual tree and collecting measurements (like tree species, diameter, height) is expensive, time consuming, and essentially infeasible beyond a local scale. So, in this effort, our group is pairing satellite imagery and aerial imagery collected from planes and unmanned aerial vehicles (UAVs) with survey data to detect, delineate and measure trees at landscape or even continental scales using deep learning and other machine learning methods. We have already released our first data set containing 100 million trees (https://zenodo.org/record/3765872#.X1P73C2ZPOR), the associated code used to delineate them (https://github.com/weecology/DeepForest), and a number of papers describing the approach (https://www.mdpi.com/2072-4292/11/11/1309, https://doi.org/10.1016/j.ecoinf.2020.101061).

This effort is in collaboration with Ethan White, Stephanie Bohlman, Aditya Singh, Daisy Zhe Wang, and Sarah Graves as well as an excellent team of post-doctoral researchers and graduate students, Ben Weinstein, Sergio Marconi, Dylan Stewart, and Ira Harmon. Learn more about this effort at our project website: https://idtrees.org/

In terms of foundational AI research, my colleague Paul Gader at UF and I are working to improve competency awareness in AI systems. Many of the very sophisticated AI approaches right now do not have the capability of knowing when they do not know the right answer. In other words, they are incapable of saying ‘I don’t know’ when appropriate. This can be an important capability when an AI system encounters unexpected inputs. For example, if an AI system analyzing medical symptoms encounters an unfamiliar set of symptoms, it would be best for it to say ‘I don’t know’ rather than attempt to provide diagnosis guidance. Our approach leverages the null space associated with each layer of a deep learning network. The null space is the set of inputs that all produce the same output value (of zero) for a layer. In other words, the network cannot differentiate them and discards all information associated with the null space. So, we monitor how much of each input is mapped to the null space and, when that amount differs from what is expected, we flag the sample as unlike what the network can reliably process. So far, we have published one paper on this topic (https://faculty.eng.ufl.edu/machine-learning/2020/07/outlier-detection-through-null-space-analysis-of-neural-networks/) and advised a Master’s thesis by Srinivas Medapati (https://ufdcimages.uflib.ufl.edu/UF/E0/05/45/18/00001/MEDAPATI_S.pdf), and we are continuing work in this area.
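The null space idea can be illustrated for a single linear layer y = W x. The sketch below is an illustration under simplifying assumptions, not the method from the paper: it ignores biases and nonlinearities, looks at one layer only, and uses a fixed tolerance as the test for "differs from what is expected". The names null_space_fraction and is_outlier and the toy weight matrix are hypothetical. It relies on the fact that the null space of W is the orthogonal complement of the space spanned by the rows of W, so the discarded part of x is whatever remains after projecting x onto those rows.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def row_space_basis(weights, tol=1e-10):
    """Orthonormal basis for the span of the rows of W (Gram-Schmidt)."""
    basis = []
    for row in weights:
        residual = list(row)
        for b in basis:
            coeff = dot(residual, b)
            residual = [r - coeff * bi for r, bi in zip(residual, b)]
        norm = math.sqrt(dot(residual, residual))
        if norm > tol:  # skip rows that add no new direction
            basis.append([r / norm for r in residual])
    return basis


def null_space_fraction(weights, x):
    """Fraction of the norm of x that the layer y = W x cannot see."""
    kept = [0.0] * len(x)
    for b in row_space_basis(weights):
        coeff = dot(x, b)
        kept = [k + coeff * bi for k, bi in zip(kept, b)]
    discarded = [xi - ki for xi, ki in zip(x, kept)]
    x_norm = math.sqrt(dot(x, x))
    return math.sqrt(dot(discarded, discarded)) / x_norm if x_norm > 0 else 0.0


def is_outlier(fraction, expected, tolerance):
    """Flag an input whose null space fraction is far from the expected value."""
    return abs(fraction - expected) > tolerance


# Hypothetical layer with 3 inputs and 2 outputs; it ignores the third input.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]

typical = [3.0, 4.0, 0.0]   # lies entirely in directions the layer uses
unusual = [0.0, 0.0, 5.0]   # lies entirely in the null space

print(null_space_fraction(W, typical))  # 0.0
print(null_space_fraction(W, unusual))  # 1.0
print(is_outlier(null_space_fraction(W, unusual), expected=0.0, tolerance=0.5))  # True
```

In this toy setting the expected fraction would be estimated from training inputs. Here it is simply set to 0.0 because the typical input has nothing in the null space.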

One of my newer efforts I want to mention is in collaboration with Diane Rowland, Chris Wilson and Jose Dubeux at UF. One of our long-term goals is to develop AI systems that can advise managers of agricultural regions on the best practices for meeting their goals (e.g., maximize profit, maximize crop yield, improve health of grazing animals) while taking into account the environmental and societal impacts of specific management decisions. This is a particularly challenging problem because there are many interdependent and complex relationships that need to be understood and taken into account when determining the best course of action. For example, you could imagine a set of AI tools that advise on fertilization levels and timing while taking into account the impact on runoff to local bodies of water and other potential environmental impacts.

Finally, I am also collaborating with Tie Liu in Horticulture at UF. Our goal in this effort is to develop an imaging tool and a set of AI algorithms that can predict when vegetables in a grocery store (or other venue in the food production pipeline) will “go bad” or lose “freshness.” Right now it is very difficult to predict this until visible symptoms appear (e.g., the broccoli starts to turn brown), but at that point the value of the vegetable has already significantly degraded. In other words, it is too late.

A more complete listing of the projects I am involved with and resulting publications can be found on my lab website: https://faculty.eng.ufl.edu/machine-learning/.

What problems or challenges does your research address?

As I mentioned, I focus on applied AI research tasks that address a practical application. There are a couple of research threads that are consistent across all of these application areas. One is the need for AI systems to be able to reliably handle unexpected inputs or outliers in a data set. Carefully curated data sets are wonderful to work with when developing AI systems, but in practice, you will often encounter something unexpected. For example, we have on-going research to estimate the maturity of peanuts using hyperspectral imagery as they go by on a conveyor belt. That system, once deployed, would need to be able to gracefully handle something other than peanuts accidentally ending up on the belt (https://www.sciencedirect.com/science/article/abs/pii/S1537511019308621 ). So, many of our research projects include some aspect of outlier detection or competency awareness embedded into our AI systems.

The second common thread is the need to have machine learning and AI systems that can learn from imprecise or uncertain labels. In many application areas, collecting precise, well-labeled imagery is time-consuming, costly, and in some cases simply infeasible. To deal with this, we have for many years been developing new machine learning methods that can learn from a variety of labels with different levels of scale, accuracy and precision (for example, labeling on the image or group-of-images scale instead of the pixel level while still obtaining pixel-level outputs). Such labels are much easier, cheaper and faster to produce.

What is the (expected) impact of your work on the field/community/world?

My goal is for our methods to be put into practice, so we regularly share our code and implementations to enable this. For example, we have shared our automated root image analysis techniques with the plant science community and regularly help new users get them up and running. Our work in forestry is providing open data sets and code to ecologists. My aim is to have our research work not only advance machine learning and AI but also provide impactful tools and data to scientists and practitioners in the application area.

How have you used NVIDIA technology either in your current or previous research?

Most of our current work relies on deep learning architectures. We rely heavily on NVIDIA GPU systems (and, specifically, the new NVIDIA DGX-A100 system at UF) to efficiently train these methods on the large data sets we are working with.

Did you achieve any breakthroughs in that research or any interesting results using NVIDIA technology?

Yes! Our state-of-the-art minirhizotron root segmentation approaches and our recently posted data set of 100 million delineated trees were both developed using NVIDIA technology. Nearly all of the students in my lab rely on NVIDIA GPUs to develop their methods and carry out their research.

What’s next for your research?

I really enjoy my cross-disciplinary collaborations to solve problems and develop technology that impacts other fields. I hope to expand this from only collaborations with other scientists and researchers to include collaboration with industry. For example, we are currently building up collaborations to start deploying the AI systems we develop in actual agricultural practice.

Any advice for new researchers?

Spend time working with others outside of your field. Some of my greatest successes and biggest joys in my career have come from these collaborations. Not only does this allow you to make impacts in both fields, but you also learn so much about the disciplines you are collaborating with. They provide new ways of thinking about problems that spur creativity and real breakthroughs.

This story was originally posted by NVIDIA.
