New datasets and, "hey! why does the tribar keep dropping?"

pietro · June 24, 2019, 2:00pm

Dear Catchers,

Many of you have expressed appreciation for the “tribar” - the horizontal bar that makes three passes showing stages of annotation of the current dataset (yellow=analyze, green=validate, blue=verify).

We recently completed the “Long Term” dataset, which investigated how late in the progression of Alzheimer’s disease (AD) stalls can be reversed while still restoring memories and other cognitive functions. The data you analyzed are now going back to the Lab for review, and we hope to have a preliminary result based on that analysis to share with you soon.

We now have a new dataset to analyze (“VEGF”), but we received it only a few days before the “Long Term” set was complete, so we didn’t have enough time to preprocess all of it. When we receive data, it comes in the form of “Image Stacks”, which are basically huge 3D images. We then use custom software to remove motion artifacts, block out regions that don’t matter, find all the vessel segments, draw outlines around them, and then render the movies that are shown on Stall Catchers (SC). Depending on the dataset and image quality, a typical image stack results in about 750 movies.

Even though computers are very fast, this entire process takes a long time. We have already uploaded many VEGF movies to SC, but many others are still being created, and we will upload those soon. The issue is that the tribar shows progress on analyzing the data currently in SC. So if we have uploaded half of the VEGF dataset and the yellow bar is at 50%, that is 50% of what has already been uploaded, not of the whole dataset. When we upload the rest of the dataset, that 50% will suddenly represent only 25% of the whole dataset, and the tribar will drop to reflect that.
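For anyone who likes to see the arithmetic, here is a minimal sketch of that rescaling in Python. The function name and the movie counts in the example are made up for illustration; this is not the actual Stall Catchers code, just the same proportion worked out.

```python
def progress_after_upload(percent_before, movies_before, movies_after):
    """Rescale a progress percentage when more movies are added.

    Hypothetical helper: assumes the bar is simply
    (completed movies) / (movies currently in SC).
    """
    if movies_before < 0 or movies_after <= 0 or movies_after < movies_before:
        raise ValueError("movie counts must be positive and non-decreasing")
    # Work that is already done does not change when new movies arrive.
    completed = percent_before / 100 * movies_before
    # The same completed work is now a smaller share of a bigger pool.
    return 100 * completed / movies_after


# Half of a 2,000-movie dataset is uploaded and the bar shows 50%.
print(progress_after_upload(50, 1000, 2000))  # 25.0
```

Nothing you did is lost in that drop: the number of completed movies stays the same, only the denominator grows.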

The current situation:
At the time of this writing, there are 65,128 movies from the VEGF dataset in SC. We think there will be a total of at least 130,000 movies to analyze. The VEGF dataset has 235 image stacks, and so far 219 of them have been converted to movies. The number of movies per image stack has ranged from ~75 to ~1500, with an average of about 600 so far. Extrapolating that average to all 235 stacks gives 235 × 600 = 141,000, so that is the total we ultimately expect. Because 141,000 is about double the current number of uploaded movies, we expect the tribar to drop by about half when we upload the rest later today.
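The estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch using the numbers quoted in this post; the variable names are my own, and the real total will depend on how many movies the remaining stacks actually produce.

```python
# Figures quoted in this post.
TOTAL_STACKS = 235           # image stacks in the VEGF dataset
AVG_MOVIES_PER_STACK = 600   # approximate average so far
MOVIES_IN_SC = 65_128        # VEGF movies currently uploaded

# Projected size of the full dataset.
expected_total = TOTAL_STACKS * AVG_MOVIES_PER_STACK

# Fraction of its current height the bar keeps once everything is uploaded.
shrink_factor = MOVIES_IN_SC / expected_total

print(expected_total)           # 141000
print(round(shrink_factor, 2))  # 0.46
```

A factor of roughly 0.46 is where "drop by about half" comes from: a bar sitting at 50% now would land near 23% after the upload.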

In ideal circumstances, we would activate ALL the movies for a new dataset at once, but in situations like this, where we completed the old dataset and we don’t want to spin our wheels reanalyzing old data (even though that still has research value), we think it’s best to upload even a small portion of a new dataset right away. I hope this explanation (however long-winded) helps convey why the tribar will drop suddenly as we add the rest of the data for a new dataset, but please let me know if anything is unclear.

Thanks for catching!

Best wishes,
Pietro

Tacitsci · February 29, 2020, 11:26pm

Pietro, can you please explain the meaning of the three stages analyze, validate, verify?

seplute · March 3, 2020, 12:24pm

Hi @Tacitsci, the “tribar” and its phases are explained in this blog post: What's the new multicolor bar on Stall Catchers?