If you get the message "No WeBWorK course was found associated to this LMS course" it does not necessarily mean that the JWT is not being decoded. That is just one possibility. Have you enabled the general debugging facility in the conf/webwork2.mojolicious.yml file? What does the debug log show?
Your debugging log is showing the same thing that another debugging log showed for someone else that I was helping. The debugging log abruptly ends on the launch request and does not complete correctly. The issue in that case was a load balancer. We never actually figured out precisely what the issue was though. Basically, the initial login request was going to one node of the load balancer, and then the launch request to another, and the two nodes were not synchronizing correctly. Every once in a while, we got lucky, and both requests went to the same node, and then the launch was successful.
Do you have a load balancer in the picture that might be causing the same problem for you?
Thanks for the reply. There is load balancing on WeBWorK, but only one node. It's an AWS server, so there needs to be an Application Load Balancer. I would not be surprised if Moodle has more nodes.
I guess the lesson is, if you're setting up WeBWorK on a new server, check that the clock syncs properly.
Actually, looking at this more closely, I realized that the "old clock sync" issue I was thinking of only applies to LTI 1.1, although LTI 1.3 has a different time-sensitive setup. Note that $LTI{v1p3}{stateKeyLifetime} defaults to 60 seconds, but can be increased if needed. Of course, it is better to synchronize the clocks as you did, but if network lag is large enough, that would be another reason to increase the setting.
So now I'm curious: I was having similar issues (which we thought was the load balancer). Could it be this same problem?
I don't think this ever would have occurred to me, and I don't know how one would diagnose this as the issue.
For that matter, if it is the issue, I'm not sure how to fix it! I'll do some checking tomorrow. I assume the time sync is done on the WeBWorK side, because I don't have access to the Moodle server.
In debug.log, I get the following tail, after a very long id_token string:
[Thu Aug 14 15:45:22.244421 2025] (eval): --------------------------------------------------------------------------------
[Thu Aug 14 15:45:22.444327 2025] (eval): Here's the course environment: WeBWorK::CourseEnvironment=HASH(0x5dcdc2ca8690)
[Thu Aug 14 15:45:22.444940 2025] (eval): Using authentication module WeBWorK::Authen::LTIAdvantage: WeBWorK::Authen::LTIAdvantage=HASH(0x5dcdc2ec6428)
[Thu Aug 14 15:45:22.445303 2025] WeBWorK::ContentGenerator::LTIAdvantage::launch: Failed to decode token received from LMS: JWT: iat claim check failed (1755207925/0 vs. 1755207922) at /opt/webwork/webwork2/lib/WeBWorK/ContentGenerator/LTIAdvantage.pm line 383.
Try adding
leeway => 10,
in the %jwt_params hash defined on line 362 of lib/WeBWorK/ContentGenerator/LTIAdvantage.pm. If that fixes the problem, then I will put in a pull request that adds an option for the value of the leeway there. The default leeway used by the Crypt::JWT module is 0, and you are showing iat values that differ by 3. Note that 1755207925/0 means that the iat in the payload is 1755207925 and the leeway is 0. The /0 is added in the error by the Crypt::JWT module.
After looking at this more closely, I see that this is a clock synchronization issue. The iat in the JWT sent by the LMS was 1755207925, which was 3 seconds in the future relative to the current time of 1755207922 on your webwork2 server. That means that the clock on the LMS server is at least 3 seconds ahead of the clock on your webwork2 server. For the JWT to be considered valid, the iat value is expected to be at or before the current time on the webwork2 server (and the exp value after it). The "leeway" is the number of seconds of clock difference that is tolerated in those checks, so with a leeway of 10 an iat up to 10 seconds in the future is still accepted. So setting that to something like 10 as I mentioned should resolve the issue that you are having. You might also check which server is off and see if the clock on that server can be synchronized, although 3 seconds is not a large difference, so using the leeway is probably an acceptable solution in that case.
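To make the arithmetic concrete, here is a minimal Python sketch of the comparison described above, using the two values from the log line. It is only an illustration of the logic, not the Crypt::JWT source, and the function name iat_is_acceptable is made up for this example.

```python
def iat_is_acceptable(iat: int, now: int, leeway: int = 0) -> bool:
    """Return True if an issued-at time is not too far in the future.

    Hypothetical helper mirroring the check described above: the token is
    rejected when iat is more than `leeway` seconds ahead of the local clock.
    """
    return iat - leeway <= now


# Values taken from the debug log: "(1755207925/0 vs. 1755207922)".
iat_from_lms = 1755207925   # iat claim in the JWT sent by the LMS
webwork_now = 1755207922    # current time on the webwork2 server

print(iat_is_acceptable(iat_from_lms, webwork_now, leeway=0))   # False
print(iat_is_acceptable(iat_from_lms, webwork_now, leeway=10))  # True
```

With these numbers any leeway of 3 or more would pass; 10 just leaves some margin for the skew to vary between launches.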
After 10 consecutive successful attempts, I am ready to conclude that this works!
Now I am sitting here laughing at the fact that such a simple thing has caused me two years of headaches.
There isn't a time sync being done between the LMS and WeBWorK. Typically each server would sync time with a network time protocol (NTP) server. If a server isn't regularly syncing with an NTP server, then the system time can drift, which causes problems with certain activities between servers where timestamps are involved (LTI is one example; another one I've run into in other contexts is file creation and permissions on network file shares).
In your case (in addition to the new pull request that Glenn referenced), you can check if the server time on your WeBWorK server matches the Moodle server. WeBWorK displays the time at the bottom of each page, so that's easy to verify. I'm sure that there are activities in Moodle that record a timestamp, so you would just have to run one of those and see if the time matches what you see from WeBWorK.
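If page timestamps turn out to be too coarse, another option is to read the iat claim straight out of the id_token that appears in debug.log and compare it with the time WeBWorK logged for that request. The following is a rough Python sketch of that idea, not part of WeBWorK; the function names are made up for this example, and it decodes the payload without verifying the signature, so it is only suitable for diagnosis.

```python
import base64
import json


def jwt_payload(token: str) -> dict:
    """Decode the payload (middle) segment of a JWT WITHOUT verifying it."""
    payload_b64 = token.split(".")[1]
    # base64url segments omit padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def iat_skew_seconds(token: str, received_at: float) -> float:
    """Seconds by which the token's iat is ahead of the local receive time.

    A positive result means the issuer's clock appears to be ahead.
    """
    return jwt_payload(token)["iat"] - received_at


# Throwaway unsigned token, built here only to exercise the functions.
demo_claims = json.dumps({"iat": 1755207925}).encode()
demo_segment = base64.urlsafe_b64encode(demo_claims).rstrip(b"=").decode()
demo_token = f"e30.{demo_segment}.signature-not-checked"

print(iat_skew_seconds(demo_token, received_at=1755207922))  # 3
```

For a real check, received_at has to be the epoch time at which the WeBWorK server received the launch request (for example, taken from the log entry), not the time at which the script is run.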
A forum post in Moodle, and opening a page in WeBWorK, suggests that time sync isn't an issue.
(Moodle doesn't display seconds, but we're at least at the same minute on both.)
I'll see if I can find time to test Glenn's PR this afternoon.