We love the video frame auto-extraction feature and the OpenCV-assisted label tracking. It simplified the heavy lifting of data tagging.
However, we believe there's a limit of 10 labels in total per frame. Once we reach 10 labels, we can no longer drag an existing bounding box by its top-left or bottom-right corner to fine-tune its extent on the fly while paused. We believe this is a UI bug.
You are correct: there is a limit of 10 bounding boxes per frame. For all of the limits we currently have, see Section 7.2, part 8 of the manual, where I tried to list every limit I could put my finger on. Thanks for the bug report; not being able to fine-tune existing bounding boxes once you reach the limit was not intended.
One thing to note about using the table to adjust values is that, as of now, it allows you to input coordinates that are not on the image. For example, in a 1920x1080 video you can currently enter (4000, 4000) as a bounding-box coordinate. That is not an actual point on the image and will only cause you errors later. This is a known bug that has been reported; for now, just be careful not to enter points that don't exist.
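If you want a quick sanity check on labels you've already entered, something like this sketch works. The `clamp_box` helper is hypothetical (not part of FTC-ML); it just pulls each corner back inside the frame:

```python
def clamp_box(x1, y1, x2, y2, width, height):
    """Clamp bounding-box corners to valid pixel coordinates of a width x height frame."""
    x1 = max(0, min(x1, width - 1))
    y1 = max(0, min(y1, height - 1))
    x2 = max(0, min(x2, width - 1))
    y2 = max(0, min(y2, height - 1))
    return x1, y1, x2, y2

# A box entered as (4000, 4000) in a 1920x1080 frame gets pulled back in bounds:
print(clamp_box(100, 200, 4000, 4000, 1920, 1080))  # (100, 200, 1919, 1079)
```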
Something I noticed when looking at your training frame is that the end result might not work out. The way TensorFlow Object Detection (TFOD) works, your input images are scaled down; in the case of FTC-ML, to either 320x320 or 640x640. You will most likely end up using 320x320, because 640x640 gives an unbearably low fps. If you imagine shrinking this image down to 320x320, you can see how it will just turn into a blurry yellow mess. While I can't say for sure, I strongly suspect this means detection will be suboptimal at best.
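To put a number on that shrinkage, here is a rough back-of-the-envelope calculation (a sketch, assuming the frame is resized to a fixed 320x320 without preserving aspect ratio, which is typical for SSD-style models):

```python
def scaled_size(obj_w, obj_h, frame_w=1920, frame_h=1080, model=320):
    """Approximate pixel size of an object after the frame is resized to model x model.

    Each axis shrinks independently because the resize ignores aspect ratio.
    """
    return obj_w * model / frame_w, obj_h * model / frame_h

# A 120x90 px object in a 1920x1080 frame:
w, h = scaled_size(120, 90)
print(round(w), round(h))  # roughly 20 x 27 px at the model's input resolution
```

So an object that looks comfortably large in the raw video can end up only a couple dozen pixels across by the time the model sees it.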
Thank you Uday. Before using FTC-ML, we trained our own customized model in Google Cloud with an accuracy of around 83%, and we also noticed that input down-sampling affected the results.
We'll probably compare the results against another video sample, center-cropped to 320x320. We're just waiting for training to become available…
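For what it's worth, the center crop itself is just array slicing; a minimal sketch (assuming frames come in as NumPy arrays in (height, width, channels) order, as OpenCV returns them):

```python
import numpy as np

def center_crop(frame, size=320):
    """Take a size x size crop from the middle of an (H, W, C) frame."""
    h, w = frame.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return frame[top:top + size, left:left + size]

# A dummy 1080p frame, cropped down to the model's input size:
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
print(center_crop(frame).shape)  # (320, 320, 3)
```

Cropping keeps full-resolution pixels in the region of interest, at the cost of discarding everything outside the crop, so it only helps when the target stays near the frame center.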
btw: any hint on which TF model is used for training? In a previous version of the ML toolchain we noticed a MobileNet-SSD base model was provided. Our team evolved it into faster models, and we believe higher fps offers more potential than a static image used just for randomization.