How to resume evaluation and let TF label the image?

The job status of evaluation shows as “stopped”. Can we resume it to complete it?

Also, when I click the “image” tab, it shows the bounding boxes I labeled, but not the bounding boxes labeled by TensorFlow. Is something wrong here?

I downloaded the tflite file and used it in the example opmode, but it didn’t detect any objects. (I had already updated the asset path and pushed the file to the right folder on the target.)

Btw, here is our training: model_uuid=b5ecc2ce9e0e462598fdca19bf3308de

Thanks for the help!

An evaluation job state of “Stopped” is normal; your training job completed and succeeded for the number of steps you requested. Everything looks “normal” - I think this is a red herring for whatever problem you’re actually seeing.

However, I’d have to see your model results to understand the rest of your statement. If you tell me your team number and give me permission to look at your account, I can access your account from the backend and see if I see anything that might point to the real problem.


Our team is 11212.
How to give you permission? Need your email?


No, you just need to say, “I solemnly swear I’m up to no good” and I can access your account with my magic administrator wand. I just can’t use it without your written permission (which saying so in this forum is enough).


Sure, please feel free to access it. We need help there.


Sorry, can I speak publicly about what I found in your training data, or would you prefer I send comments to an email address? I did not find anything wrong with the way the model was processed or handled by TensorFlow. All of the issues were with inconsistent labeling of the video you provided, which left the model unable to train correctly for the elements you were trying to label.

Sure. It is fine to comment publicly.
Can you be more specific on “inconsistent labeling”?

Yes! Thank you for letting me comment publicly, I think this could help a lot of people.

The first item I wanted to talk about is the video orientation. It appears that you took this video with a cell phone - that’s fine, and you even took the video in a reasonable resolution, so well done. However, you took this video in portrait mode - on the robot the video will likely never be in portrait mode (unless you know how to ask the camera to rotate the image before presenting it to Vuforia). It makes a big difference because of the downscaling that will occur - in order to be processed by the base model, the image is going to be downscaled to a 300x300 pixel square. Things look very different in portrait vs landscape when this kind of downscaling occurs - the long side of the frame gets squeezed more than the short side, so the same object ends up with different proportions depending on the orientation. The model doesn’t know “which end is up” so the effect can be detrimental to the model. Always make sure you take the video in the same orientation that the camera on the robot will be seeing things.

Let’s talk about downscaling again. On a 1280x720 image that gets downscaled to 300x300, an object that is 140px x 115px gets downscaled to 32px x 27px - that’s what the model itself will see. If you have very small objects in the image, it’s doubtful that the model will be able to train on and identify them accurately once they’re scaled down that far. All hope isn’t lost, though; just understand that object recognition might suffer. You want the objects to be as large in the image as you can get them, and you want a lot of variation in the size of the object. If the object could be seen as 300x400 pixels, make sure some of your training frames have the object at 300x400 pixels (and a good bit of variation in between).
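To put numbers on this, here is a minimal Python sketch of the arithmetic. The function names are hypothetical, and the exact resizing rule the training pipeline applies is an assumption on my part: the figures quoted above match scaling both dimensions by the width ratio (300/1280), while squashing each axis independently to 300 would leave the height somewhat larger. Either way the object ends up only a few dozen pixels across.

```python
def scale_by_width(obj_w, obj_h, img_w, target=300):
    """Scale both object dimensions by the same factor, target / image width."""
    factor = target / img_w
    return obj_w * factor, obj_h * factor


def scale_per_axis(obj_w, obj_h, img_w, img_h, target=300):
    """Squash each axis independently so the whole frame becomes target x target."""
    return obj_w * target / img_w, obj_h * target / img_h


if __name__ == "__main__":
    # A 140x115 px object in a 1280x720 landscape frame.
    print(scale_by_width(140, 115, 1280))       # (32.8125, 26.953125)
    print(scale_per_axis(140, 115, 1280, 720))  # width is the same, height is larger
```

Running the same functions with the frame dimensions swapped (720x1280) gives a quick feel for how differently a portrait video treats the same object.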

Ok, let’s talk now about labeling, and specifically “inconsistent labeling.” One of the mantras you have to adopt with machine learning is “Garbage in, Garbage out.” If you provide bad labeling, the model is going to be bad. Let’s take a couple of examples:

Here’s a small montage of how you labeled “Top”. When labeling objects, you always want to provide consistent labels that show the object within the bounding box, with a little room on each side for background. Frame 1 had a great label for “Top”, but very few frames afterward had reasonable labels (the bounding box only covers a small percentage of the object you want to track, includes other objects that might be classified differently, or misses the mark altogether). If you want the model to consistently know what a “Top” is, it needs to be consistently labeled. If you’re using a tracker, you need to always watch that tracker and stop tracking when it goes off the rails, fix it, and resume.
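Watching the tracker by eye is the real fix, but if you can get your boxes out as numbers, a rough automated check can point you at frames worth a second look. This is only a sketch under assumptions: the box format (x_min, y_min, x_max, y_max), the function names, and the 0.5 threshold are all hypothetical and not something the tool provides. It flags frames where a label’s box barely overlaps the box from the previous frame.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def suspicious_frames(boxes, min_iou=0.5):
    """Return frame numbers whose box overlaps poorly with the previous labeled frame.

    boxes: dict mapping frame number -> box for a single label (hypothetical format).
    """
    frames = sorted(boxes)
    flagged = []
    for prev, cur in zip(frames, frames[1:]):
        if iou(boxes[prev], boxes[cur]) < min_iou:
            flagged.append(cur)
    return flagged


if __name__ == "__main__":
    top_boxes = {
        1: (100, 100, 200, 200),
        2: (102, 101, 202, 201),  # small, plausible movement
        3: (300, 300, 320, 320),  # box jumped and shrank: worth checking
    }
    print(suspicious_frames(top_boxes))  # [3]
```

A flagged frame is not necessarily wrong (fast camera motion will trigger it too), and a box that drifts slowly will slip through, so treat this as a prompt to go look, not as a verdict.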

It’s also important to recognize that frame 301 had no labels, but it was not marked as an ignored frame. When you look at the Datasets, frames that are not marked as ignored but have no labels are known as “Negative” frames. Negative frames should NEVER have objects in them that you want to identify - they are only used to tell the model training “ignore everything in this frame as background.” Frame 301 is considered a “Negative” frame, but it had one of the best images of the “Top” that you could provide, which likely confused the neural net so that it didn’t know what it should be looking at. Be careful when you have Negative frames - make sure they are intentional!
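The rule above is simple enough to write down as code. This is a minimal sketch assuming a hypothetical record per frame with a frame number, an ignored flag, and a list of labels; it is not the tool’s actual data format. It sorts frames into labeled, Negative, and ignored so you can review every Negative frame and confirm it is intentional.

```python
def classify_frames(frames):
    """Split frame records into labeled, negative, and ignored frame numbers.

    Each record is a dict with hypothetical keys:
      "frame"   - frame number
      "ignored" - True if the frame was marked as ignored
      "labels"  - list of label names drawn on the frame
    """
    result = {"labeled": [], "negative": [], "ignored": []}
    for f in frames:
        if f["ignored"]:
            result["ignored"].append(f["frame"])
        elif f["labels"]:
            result["labeled"].append(f["frame"])
        else:
            # Not ignored and no labels: training treats everything here as background.
            result["negative"].append(f["frame"])
    return result


if __name__ == "__main__":
    frames = [
        {"frame": 300, "ignored": False, "labels": ["Top"]},
        {"frame": 301, "ignored": False, "labels": []},
        {"frame": 302, "ignored": True, "labels": []},
    ]
    print(classify_frames(frames)["negative"])  # [301]
```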

So now let’s look at a small montage of how you labeled “base”:

Again, inconsistent labeling. You started and finished strong, but too many of the frames in-between are a hot mess (excuse the phrase). You didn’t have many frames with a Base in them to begin with, so every improperly labeled frame hurts your detection massively!

But what I noticed more is what you didn’t label. How many bases are in the image? Do you eventually want the model to label this as a base?


If so, not labeling them in every frame tells TensorFlow “ignore anything that looks like this.” The problem with that is that some of the images you “ignored” look a LOT like the ones you DON’T want to ignore (some of these ignored areas look a lot like a “base” in frames 687 and 203).

I hope this helps you!


Thank you! Will relabel and test again.