(me)Hi, I'm Andrew Ganse. I'm an applied physicist and data scientist, lately building out machine learning and data analysis systems for physical sensor data at Seattle area startup businesses. Whether in the commercial world or in the academic science world, the focus of my work has been in machine learning, inverse problems, optimization, signal processing, and data analysis. (My PhD as well as much of my work at APL-UW concerned inverse problems in geophysics). Here on research.ganse.org are some of my publicly shareable research results and tools, both in data science topics and in applied physics topics. You can contact me at andrew@ganse.org.


DEPLOYABLE CV DEEP LEARNING EXAMPLES BASED ON MEDICAL IMAGING
Screenshot 2024-03-24 at 3.58.22 PMLet's explore some detection problems in medical imaging based on some public datasets and MLFlow's Projects functionality. A self-contained modeling module is trained, has its performance logged in MLFlow, and is able to be checked out as a deployable model image. There's a configurable implementation of this in my aganse/py_tf2_gpu_dock_mlflow repo. Let's try the malaria detection dataset from the Tensorflow datasets, which contains a balanced, labeled dataset of about 27,000 thin blood smear slide images of cells, and let's see how well we can detect malaria parasite presence in the images. This dataset is used to train/test different variations of image classification models, including VGG-16 and various sizes of more basic convolutional networks.



GETTING MLFLOW+DATABASE RUNNING QUICKLY VIA DOCKER
mlflow_screen_shotThis provides a get-running-quickly Docker-compose setup using containers for MLflow, PostgreSQL, and NGINX. Run MLflow's database in PostgreSQL, and put an NGINX reverse proxy in front of the MLflow website to allow some level of access restriction (say for a workgroup within an already-firewalled company intranet).



DBSCAN CLUSTERING IN DECRYPTING AN IMAGE CYPHER
fowl_cypherThis wonderful kids' book series is fun not only for the stories themselves, but also because each of the first several books involves a cipher puzzle with "fairy hieroglyphics" - I love code puzzles! In the electronic form of the books I discovered the hieroglyphic sequence was moved to the back of the book, all perfectly lined up in matrices over a few pages at the end. And I thought, hey that seems like it'd be easy to parse and decrypt on a computer, just like the main character did!



ELECTROMAGNETIC INVERSION OF ESTUARINE SALINITY STRUCTURE USING SMALL-SCALE CSEM
[csem fig]The Conductivity Profiler is an instrument for remotely observing estuarine salinity profiles via electromagnetic measurements. Electromagnetic (EM) waves are attenuated in seawater as a function of frequency, and conductivity structure (closely related to salinity structure) in the water can be inferred by combining measurements of EM waves at different frequencies on a distant electric field receiver. Geophysical inversion methods are applied to estimate the estuarine salinity profile from the EM measurements. Using inverse theory techniques, we take advantage of statistical rigor and let the data determine the structure of the conductivity profile and quantify the uncertainty and resolution of the salinity profile.




CREATING A GPT CLIENT WITH PARAMETER CONTROL & WEBLINK SUBMISSION
I have found OpenAI's GPT models to be fabulously productive tools and use them often in my technical work now. But to get what I want out of the models for my uses has taken accessing the models from the API rather than the ChatGPT website GUI. GPT exampleThis allows me to change some of the model parameters, format the output as I wish, and run the whole thing in my terminal. Of course the process of making the app has provided highly useful education in understanding how the models work as well, including how interacting with them via API can enable no end of use cases from other automated code.



PREDICTING BANK LOAN BEHAVIOR WITH RANDOM FOREST MODELS
Bank Loan PredictionLet's implement a random forest classifier from Scikit-Learn to see how well we can predict whether a bank client will have good loan behavior (meaning they won't default or become delinquent) if they are given a new loan. We'll use a public bank transactions/loans dataset from the PKDD99 Challenge conference for the modeling. In the process we'll fit and explore the assumptions made for this model, and learn about some limitations of Scikit-Learn's tree-based models.



INTERACTIVE GPS DATA VISUALIZATIONS IN PYTHON/JUPYTER
[gps/map fig]Did you know you can plot your geographic data on interactive maps embedded directly in your Python notebooks? Check it out, as we play with and analyze some GPS tracking data. A database of tracked walking routes data available on a health/fitness website provides a convenient trove of data not only to play with, but also to explore the geometric interference effects of downtown buildings upon GPS track solutions.


RADIO SCIENCE GRAVITY INVERSION FOR ICY MOON INTERNAL STRUCTURE
[NASA fig]The nature of an icy satellite's interior relates fundamentally to its composition, thermal structure, formation and evolution history, and prospects for supporting life. Gravity measurements via radio Doppler information during spacecraft flybys are an important tool used to infer gross interior structure of these moons. Liquid water and ice layers have previously been inferred for the interiors of Jupiter's icy satellites Europa, Ganymede, and Callisto on the basis of magnetic field measurements by the Galileo probe, and on Europa and Callisto induced magnetic field signatures measured by the Galileo probe provided strong evidence for an ionic aqueous ocean. We apply geophysical inverse theory tools to assess the icy moon's interior density anomaly distribution that could be estimated from radio Doppler measurements, to support the search for mass anomalies in the ice shell (meteorites or diapiric upwellings) or near the H2O/rock interface (seamounts).