STOP_AND_RESET_ENCODER sometimes doesn't stop motor?

pmichaud · January 3, 2024, 9:56pm

Hello all,

For several years our team has encountered (and today we can now reliably repeat/document) that sometimes placing a DcMotor into STOP_AND_RESET_ENCODER mode doesn’t actually remove power to (i.e. “stop”) the motor.

Here’s the bottom line image capture – note that even though the motor is in STOP_AND_RESET_ENCODER mode it’s drawing approx 7 amps.

Is this a known issue, or is there something going on that we should be aware of? The Javadocs seem to clearly state that the motor should have power removed as a side-effect of placing a motor in this mode.

The (straightforward) OnBotJava code we are using is attached as ElevTest.java . I made a video that demonstrates this in more detail at https://youtu.be/rkFPpIvgoFs .

My questions would be:

is this a known issue?
are we doing something wrong that causes this?
is there a standard workaround or code snippet to make sure that power to the motor is actually removed?

All comments and suggestions greatly appreciated!

Pm

Coach, FTC #7172 Technical Difficulties
PDP Emeritus, North Texas FTC

pmichaud · January 3, 2024, 9:58pm

Oops, I forgot to provide the code – I don’t see how to do attachments so I’ll paste it here (ElevTest.java):

package org.firstinspires.ftc.teamcode;

import com.qualcomm.robotcore.eventloop.opmode.LinearOpMode;
import org.firstinspires.ftc.robotcore.external.navigation.CurrentUnit;
import com.qualcomm.robotcore.hardware.DigitalChannel;
import com.qualcomm.robotcore.hardware.DcMotorEx;
import com.qualcomm.robotcore.eventloop.opmode.TeleOp;
import com.qualcomm.robotcore.eventloop.opmode.Disabled;
import com.qualcomm.robotcore.hardware.DcMotor;
import com.qualcomm.robotcore.hardware.DcMotorSimple;
import com.qualcomm.robotcore.util.ElapsedTime;

@TeleOp

public class ElevTest extends LinearOpMode {

    private DcMotorEx elev;
    private DigitalChannel elevLimit; 

    @Override
    public void runOpMode() {
        
        elev = hardwareMap.get(DcMotorEx.class, "elev");
        elev.setDirection(DcMotorSimple.Direction.REVERSE);
        elevLimit = hardwareMap.get(DigitalChannel.class, "elim");

        telemetry.addData("Status", "Initialized");
        telemetry.update();
        waitForStart();
        
        boolean aLast = false;
        boolean retracting = false;

        while (opModeIsActive()) {
            
            if (retracting && !elevLimit.getState()) {
                elev.setMode(DcMotor.RunMode.STOP_AND_RESET_ENCODER);
                retracting = false;
            }            
            
            boolean aThis = gamepad1.a;
            if (aThis && !aLast) {
                elev.setMode(DcMotor.RunMode.RUN_USING_ENCODER);
                elev.setVelocity(-1000);
                retracting = true;
            }
            aLast = aThis;

            telemetry.addData("elev.pos", elev.getCurrentPosition());
            telemetry.addData("elev.mode", "%s", elev.getMode());
            telemetry.addData("elev.current", "%.2fA", elev.getCurrent(CurrentUnit.AMPS));
            telemetry.addData("elevLimit.getState", elevLimit.getState());
            telemetry.addData("retracting", retracting);
            telemetry.addData("Status", "Running");
            telemetry.update();
        }
    }
}

Westside · January 4, 2024, 3:06am

Great job describing this issue!

TLDR: For a quick workaround, move RUN_USING_ENCODER before while opModeIsActive(). This worked in my version of your code, as described below.

============

Background

I was able to replicate the “rogue state” where Telemetry reports STOP_AND_RESET but the motor is still running. This was possible with a simpler version of your code, using gamepad button B instead of a digital limit switch (thus eliminating that potential source of sensor delay).

The rogue state arose only when the stop event happened very soon after the start event, which is true for your robot mechanism.

Your code also has another characteristic, that pressing the A button allows retraction to resume, after the limit switch was triggered. The code logic seems to preclude that, but it does happen.

In fact, a way to recover from the rogue state, is to press the A button again, which resets retracting to True, which allows the limit switch (my button B) to again call STOP_AND_RESET, with success.

I think there are three considerations here:

Do you want retracting and limit-stopping to be mutually exclusive? Namely start just once, then stop when limited? If so, there’s no need to catch the rising edge of Button A.
There is indeed a tricky timing issue with the RunModes, possibly related to loop cycle times or the state of logic variables. More on that below.
Further obscurity is provided by the FTC Javadocs which you mentioned. Far from being “clear”, here’s the full blurb (emphasis added):

STOP_AND_RESET_ENCODER
.
public static final DcMotor.RunMode STOP_AND_RESET_ENCODER
.
The motor is to set the current encoder position to zero. In contrast to RUN_TO_POSITION, the motor is not rotated in order to achieve this; rather, the current rotational position of the motor is simply reinterpreted as the new zero value. However, as a side effect of placing a motor in this mode, power is removed from the motor, causing it to stop, though it is unspecified whether the motor enters brake or float mode. Further, it should be noted that setting a motor to STOP_AND_RESET_ENCODER may or may not be a transient state: motors connected to some motor controllers will remain in this mode until explicitly transitioned to a different one, while motors connected to other motor controllers will automatically transition to a different mode after the reset of the encoder is complete.

==========

Testing

Here was my first version, replacing your limit switch with Button B. Allows testing with just a DC motor, not a full robot.

v01

if (retracting && gamepad1.b) {
     motor.setMode(DcMotor.RunMode.STOP_AND_RESET_ENCODER);
     retracting = false;
}            

boolean aThis = gamepad1.a;
if (aThis && !aLast) {
     motor.setMode(DcMotor.RunMode.RUN_USING_ENCODER);
     motor.setVelocity(-1000);
    retracting = true;
}
aLast = aThis;

I then tried other versions to address possible underlying issues of:

code logic
loop timing
human response (button release)

The following versions did not eliminate the problem. For all of these, a fast sequence of A, then B, could generate the rogue state. This was true whether or not A was physically released before pressing B.

v02

 boolean limitReached = false;

 while (opModeIsActive()) {

     if (!limitReached) {    

           if (gamepad1.b) {
                 motor.setMode(DcMotor.RunMode.STOP_AND_RESET_ENCODER);
                  limitReached = true;
           }            
         
           else if (gamepad1.a) {
                 motor.setMode(DcMotor.RunMode.RUN_USING_ENCODER);
                 motor.setVelocity(-1000);
           }

    }     // end if !limitReached

    // telemetry

}    // end while

v03

boolean limitReached = false;

while (opModeIsActive()) {

       if (!limitReached && gamepad1.b) {
              limitReached = true;
              motor.setMode(DcMotor.RunMode.STOP_AND_RESET_ENCODER);
       }            
            
      else if (!limitReached && gamepad1.a) {
              motor.setMode(DcMotor.RunMode.RUN_USING_ENCODER);
              motor.setVelocity(-1000);
      }

    // telemetry

}    // end while

=========

Separation

I ending up suspecting the culprit was a fast succession of initiating RUN_USING_ENCODER and STOP_AND_RESET. Sure enough, this always works as intended:

v04

boolean limitReached = false;

motor.setMode(DcMotor.RunMode.RUN_USING_ENCODER);

while (opModeIsActive()) {
            
      if (!limitReached) {
            
           if (gamepad1.b) {
                limitReached = true;
                motor.setMode(DcMotor.RunMode.STOP_AND_RESET_ENCODER);
          }            
            
           else if (gamepad1.a) {
                    motor.setVelocity(-1000);
           }
            
     }   // end if !limitReached
            
    // telemetry

}    // end while

No sequence of fast or overlapping inputs could generate the rogue state.

Lastly, I applied this to the first version that replicates your code, and it also works:

v05


boolean aLast = false;
boolean retracting = false;
motor.setMode(DcMotor.RunMode.RUN_USING_ENCODER);

while (opModeIsActive()) {
            
      if (retracting && gamepad1.b) {
            motor.setMode(DcMotor.RunMode.STOP_AND_RESET_ENCODER);
            retracting = false;
       }            

       boolean aThis = gamepad1.a;
       if (aThis && !aLast) {
              motor.setVelocity(-1000);
              retracting = true;
       }
       aLast = aThis;

    // telemetry

}    // end while

===============

Conclusion

Perhaps there’s no “bug” per se with a single RunMode, but instead there are rare interactions to document and avoid, eventually to improve if deemed essential.

pmichaud · January 4, 2024, 4:27am

Thanks for the quick analysis and response. I have a ton of questions now, which I can’t immediately test on the team’s robot.

For the three considerations you mention, here are some quick responses:

(rising edge of button A) I added the rising edge detection in this example so that there would be little question of when and how many times elev.setMode(RUN_USING_ENCODER) and elev.setVelocity(-1000) get invoked – they get invoked exactly once on the rising edge of A, and not again. You correctly note that the code doesn’t guard against long presses of A, but the video clearly shows (e.g. by single-stepping frames) that the A button has been fully released long before the limit switch is reached, so this shouldn’t be a situation of the A button still being depressed when the limit switch is reached.
(Tricky timing issue) – more on that below.
(Obscure Javadocs w/ emphasis) – yes, the part about “automatically transition to a different mode” bugs me – if the controller is automatically transitioning to a different mode after reset is complete, I’d expect that automatic transition to be reflected in .getMode() as well. That’s not happening here – if there’s an automatic transition taking place then it’s happening entirely internally to the controller and the DcMotor(Ex) instance still reports STOP_AND_RESET_ENCODER from .getMode(). That’s… concerning somehow.

And also, a question:

4a. (new) Does .setVelocity() automatically switch the RunMode to RUN_USING_ENCODER? (I don’t have access to the robot at the moment to test this myself.) If it does, does this implicit mode change not trigger an equivalent “fast sequence” of RUN_USING_ENCODER followed by STOP_AND_RESET_ENCODER that could trigger the bug? In other words, does the problem come down to the explicit .setMode(RUN_USING_ENCODER) inside the loop, and the problem goes away entirely if that’s not present?

4b. What happens in your v04 and v05 if there’s no elev.setMode(RUN_USING_ENCODER) in the program at all? (i.e., always setVelocity to “implicitly” set it)

As far as timing goes, how “fast” does “fast sequence of A, then B” is “too fast”? If I single-step through my video, there are at least 9 frames from when the elevator starts moving until the limit switch turns on (at approx 1m42s in the video).

For my 30fps video, 9 frames means at least 300ms from the time that the motor started with RUN_USING_ENCODER until the limit switch was triggered and STOP_AND_RESET_ENCODER occurred. I tend to think of 300ms as being a fairly significant amount of time in robot terms, not “fast” at all. (As I describe robot time/speed to my team members: 300ms for a 60in/sec robot is 18 inches, which is not insignificant.)

Put another way, at least two separate telemetry updates occur (one updates to 1.68A, another to 3.14A, I assume at least 250ms apart) before STOP_AND_RESET_ENCODER appears.

So if 300ms is too fast for a (explicit?) motor mode change, how much time do we need to allow? That really needs to be quantified and documented somewhere, I fear.

[This all becomes very problematic when trying to implement something like manual control of an elevator, where you do often need to switch between modes depending on whether you’re moving to a position (automatic), moving manually (run using encoder/velocity), or resetting based on a limit switch.]

I’ll try to do some more tests and questions later, but those are my initial thoughts and questions. Thanks again for investigating this so thoroughly thus far!

Pm

pmichaud · January 4, 2024, 7:39am

So, I managed to get time with the team robot again and was able to make another video. https://youtu.be/bbklduMGPxI if you want to jump straight to the video.

I tried the suggestion of moving the RUN_USING_ENCODER to before the while(opModeIsActive()) loop, and it works, but only for the first/initial retraction. Subsequent retractions still have the problem of leaving the motor in a powered state, because .setVelocity() automatically/internally changes the mode to RUN_USING_ENCODER if the current mode is not already one of RUN_USING_ENCODER or RUN_TO_POSITION. (I found a post from Noah @ REV that confirms this.)

So, unless we happen to already be in one of those two modes when the A button is pressed, .setVelocity gives an implicit .setMode(RUN_USING_ENCODER) command which again ends up messing up the STOP_AND_RESET_ENCODER mode when the limit switch is reached.

The video also demonstrates our more general case of using RUN_TO_POSITION to extend outward to a known position, but then using RUN_USING_ENCODER to retract back to the limit switch. Many people will say that we should just use RUN_TO_POSITION to return to the zero position, but we feel that makes some undesirable assumptions: (1) that the zero position has already been properly set and maintained, (2) there hasn’t been any belt/gear/string slippage or encoder issues that might cause the zero encoder position to be reached before the limit switch is reached, (3) an automatic slow down of the motor/PID control as the encoder nears the zero position.

In theory I suppose we could simply RUN_TO_POSITION with .setTargetPosition(Integer.MIN_VALUE) to do a full run back to the limit switch (hopefully the encoder never gets anywhere near MIN_VALUE) and then do the STOP_AND_ZERO when the limit switch is reached. But I think there’s still the possibility that a .setMode(RUN_TO_POSITION) occurs too close in time to the STOP_AND_ZERO_ENCODER mode change and will leave the motor powered.

Comments welcomed, and I’ll keep investigating.

Pm

pmichaud · January 4, 2024, 8:51am

Apologies for the multiple posts in a row, but I keep finding more details as we investigate further.

I ended up asking myself the question… how fast is “too fast” for a mode switch? So the program got changed to use a timer instead of a limit switch or a button to stop the retraction:

        boolean aLast = false;
        boolean retracting = false;
        ElapsedTime timer = new ElapsedTime();

        while (opModeIsActive()) {
            
            if (retracting && timer.milliseconds() > 475) {
                elev.setMode(DcMotor.RunMode.STOP_AND_RESET_ENCODER);
                retracting = false;
            }            
            
            boolean aThis = gamepad1.a;
            if (aThis && !aLast) {
                elev.setMode(DcMotor.RunMode.RUN_USING_ENCODER);
                elev.setVelocity(-1000);
                retracting = true;
                timer.reset();
            }
            aLast = aThis;

            // telemetry
        }

With the timer set to expire at 470ms and 475ms, STOP_AND_RESET_ENCODER failed to turn off the motor 10 out of 10 times tested (100%). When set to 480ms it failed 5 out of 10 tests, at 490ms it failed 3 of 10 tests, and at 495ms it failed 2 of 10. At 500ms I didn’t observe any failures.

So, it appears that STOP_AND_RESET_ENCODER needs to occur at least 0.5sec after a mode change to RUN_USING_ENCODER, otherwise the motor can be left in a powered state. If done on a mechanism with a natural mechanical limit this might just drain batteries and heat up motors; but for some mechanisms it can result in mechanical or gameplay failures.

For the robots our team typically designs, 0.5 seconds is considered a really long time. In this year’s game, we really want that elevator mechanism to be fully retracted before we get to the Stage Door or Height Limit bars, and we can easily get to those from the Backdrop in less than 0.5 sec. And a motor velocity of 1000 ticks/sec isn’t all that fast – most of the common FTC motors can hit 2500 ticks/sec easily (especially if gravity assisted as would happen in the case of retracting something like an elevator).

So, we’ll keep searching for other answers. I’m guessing our answer will be to avoid STOP_AND_RESET_ENCODER mode altogether, and do our own zero-position processing with an offset value or something like that. Our programmers have always really preferred it when the built-in encoder value matches the actual zero point, but this probably pushes them away from that.

The inability to quickly switch modes in DcMotor might also explain other issues we’ve observed on our robots. In both last season and this season we’ve had instances where a driver would request a movement of some sort, and there’s a noticeable lag before the movement occurs (or sometimes not at all), even though telemetry shows that the motors were commanded to change. We chalked these up to programming or driver discrepancies, but now I’m wondering if changing RunModes is a culprit. We’ll have to do more testing.

I would like to see the documentation updated to note that STOP_AND_RESET_ENCODER might not de-power motors if it occurs too close in time to another mode change. This discovery explains a lot about why we’ve had so many motors overheat on our robots, because what we thought should be idling/depowering our motors was in fact drawing a significant amount of power.

Thanks for listening, for replicating/confirming what we were observing, and the terrific suggestions/pointers of things to try next.

Pm

P.S.: Perhaps this thread should be logged as a GitHub issue for FTCRobotController? I’ll be happy to do that if it’s appropriate.

P.P.S.: Now logged as GitHub issue #899 , apologies if that’s not appropriate.

pmichaud · January 5, 2024, 1:06pm

Surprisingly, adding a simple .setPower(0) before the while loop seems to make the problem go away entirely:

        boolean aLast = false;
        boolean retracting = false;
        ElapsedTime timer = new ElapsedTime();

        elev.setPower(0);
        while (opModeIsActive()) {
            
            if (retracting && timer.milliseconds() > 475) {
                elev.setMode(DcMotor.RunMode.STOP_AND_RESET_ENCODER);
                retracting = false;
            }            
            
            boolean aThis = gamepad1.a;
            if (aThis && !aLast) {
                elev.setMode(DcMotor.RunMode.RUN_USING_ENCODER);
                elev.setVelocity(-1000);
                retracting = true;
                timer.reset();
            }
            aLast = aThis;

            // telemetry
        }

In fact, it appears that doing any .setPower() call while the motor is in RUN_WITHOUT_ENCODER mode causes STOP_AND_RESET_ENCODER to work as expected from then on, even if the motor is subsequently placed into RUN_TO_POSITION or RUN_USING_ENCODER modes and uses .setVelocity().

I have no theories for why a simple .setPower() in RUN_WITHOUT_ENCODER mode would “fix” the unwanted interactions between STOP_AND_RESET_ENCODER, RUN_USING_ENCODER, and RUN_TO_POSITION that occur if the .setPower() isn’t executed. It’s exactly these sorts of “spooky interactions behind the API” that keep me awake at night – since there’s no obvious explanation of why this can make things suddenly start “working”, there’s little way of knowing if there are other conditions in which it starts failing again.

But at least there seems to be a plausible workaround for now.

Performing an explicit .setVelocity(0) or .setPower(0) immediately prior to the .setMode(STOP_AND_RESET_ENCODER) line also seems to work around the problem. That makes a little more sense to me as a workaround. For the most reliability we’ll probably advise our programmers to do both workarounds, and keep a careful eye on our motor current draws.

Thanks to everyone who helped with this analysis!

Pm