Author Topic: Firmware 1.0.2 kills my oaks Wifi and sketches  (Read 17219 times)

saperlot

  • Newbie
  • *
  • Posts: 16
Re: Firmware 1.0.2 kills my oaks Wifi and sketches
« Reply #30 on: June 28, 2016, 04:12:10 am »
Saperlot won't have rolled back to 1.0.1. There is a big issue in the Arduino IDE Boards Manager: when a board version is removed from the JSON file and you previously had that version installed, it stops working. Hence, for anyone who updated to 1.0.2, when the IDE next auto-updates all the board files, it suddenly goes, 'hey, there's no board package here' (the "Board oak1 (platform oak, package digistump) is unknown" message)... and basically you are up the creek unless you know what is going on.

When that happens, you go into the board manager, remove that offending board/version, and then you install whatever version you can (1.0.1 in this instance... unless you do the beta test 1.0.4).

Even then, Arduino still says:
Code: [Select]
Board oak1 (platform oak, package digistump) is unknown

Error compiling for board Oak by Digistump (Pin 1 Safe Mode - Default).
This happened to me on OS X yesterday and on a Windows machine today.

Trying to remove 1.0.1 gives:
Code: [Select]
Could not find boards.txt in C:\Users\[username]\AppData\Local\Arduino15\packages\digistump\hardware\oak\1.0.1. Is it pre-1.5?
This folder exists but is empty. Arduino still thinks that this board is installed.
Removing the 1.0.1 folder gives me the option to install it again in Arduino, but doing so still results in the same "Board oak1 is unknown" error. Only removing the 1.0.2 folder from C:\Users\[username]\AppData\Local\Arduino15\packages\digistump\hardware\oak\ lets me build successfully again.
I use Arduino 1.6.9.

So please, never ever delete items from your package JSON.
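
For anyone who wants to check their own install for the empty version folders described above, here is a minimal Python sketch. It only assumes what the error message says: that each version folder under the oak directory should contain a boards.txt. The function name `find_broken_versions` is made up for illustration, and the demo runs against a throwaway temporary directory rather than a real Arduino15 folder.

```python
from pathlib import Path
import tempfile


def find_broken_versions(oak_dir):
    """Return the names of version folders under oak_dir that lack boards.txt."""
    root = Path(oak_dir)
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and not (child / "boards.txt").is_file()
    )


if __name__ == "__main__":
    # Demo on a temporary tree that mimics the situation in this post:
    # an empty 1.0.1 folder next to a folder that does have boards.txt.
    with tempfile.TemporaryDirectory() as tmp:
        oak = Path(tmp) / "oak"
        (oak / "1.0.1").mkdir(parents=True)  # empty, like the failing case
        (oak / "1.0.4").mkdir()
        (oak / "1.0.4" / "boards.txt").write_text("# placeholder\n")
        print(find_broken_versions(oak))
```

To use it on a real machine, pass the path to your own ...\Arduino15\packages\digistump\hardware\oak folder. The sketch only reports folders; it does not delete anything.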

kh

  • Jr. Member
  • **
  • Posts: 64
  • OakTerm developer
Re: Firmware 1.0.2 kills my oaks Wifi and sketches
« Reply #31 on: June 28, 2016, 10:18:42 am »
Thanks for the explanation.

So it sounds like my instructions for 1.0.4 beta should also say (between steps 1 and 2):

  • First remove all existing "Oak by Digistump" boards with Boards Manager
  • If this gives an error, delete the corresponding folder from "AppData\Local\Arduino15\packages\digistump\hardware\oak\"
  • Reinstall the latest version available with Boards Manager

Can you confirm this is correct?

@PeterF, for people that have not made any changes in Boards Manager, do you know if things will start working again, without any intervention on their part, if @digistump restores 1.0.2 on his server?

saperlot

  • Newbie
  • *
  • Posts: 16
Re: Firmware 1.0.2 kills my oaks Wifi and sketches
« Reply #32 on: June 29, 2016, 01:41:53 am »
Quote from: kh
So it sounds like my instructions for 1.0.4 beta should also say (between steps 1 and 2):

  • First remove all existing "Oak by Digistump" boards with Boards Manager
  • If this gives an error, delete the corresponding folder from "AppData\Local\Arduino15\packages\digistump\hardware\oak\"
  • Reinstall the latest version available with Boards Manager

Can you confirm this is correct?

I can't reproduce it, but it should work.

PeterF

  • Hero Member
  • *****
  • Posts: 877
Re: Firmware 1.0.2 kills my oaks Wifi and sketches
« Reply #33 on: June 29, 2016, 07:54:05 pm »
I'm not sure... I don't see why not, as I think it is just a version check, but the Arduino folks have some strange ideas at times! :-O I expect that the 1.0.2 board could be restored in the JSON, and when they either manually refresh by opening the Boards Manager and letting it check for updates, or just leave the IDE open for a few minutes so it does it itself in the background, all should be good again... And yes, your updated instructions should be good... the oak folder should just be emptied of its contents...

kh

  • Jr. Member
  • **
  • Posts: 64
  • OakTerm developer
Re: Firmware 1.0.2 kills my oaks Wifi and sketches
« Reply #34 on: June 29, 2016, 08:13:29 pm »
Thanks for the feedback @saperlot and @PeterF. I've updated my 1.0.4 beta install instructions post.

digistump

  • Administrator
  • Hero Member
  • *****
  • Posts: 1465
Re: Firmware 1.0.2 kills my oaks Wifi and sketches
« Reply #35 on: July 02, 2016, 09:46:01 am »
I apologize both for the delay on my part and the issues caused by pulling 1.0.2. Even after years of working with the Arduino IDE, I still don't always understand everything it does or what will cause problems.

I've restored 1.0.2 in the JSON so people can upgrade from it, and added 1.0.3r2, which is the same as kh's beta 1.0.4 (many thanks to him for doing all the hard work on the last two releases).

If anyone can test 1.0.3r2 and see if it makes their Oaks happy, I'd greatly appreciate it.

digistump

  • Administrator
  • Hero Member
  • *****
  • Posts: 1465
Re: Firmware 1.0.2 kills my oaks Wifi and sketches
« Reply #36 on: July 02, 2016, 10:10:44 am »
Actually, it is released as 1.0.4 instead of 1.0.3r2, as the Arduino IDE doesn't accept any other type of version number.
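
As a rough illustration of why 1.0.3r2 gets rejected while 1.0.4 is fine, here is a small Python sketch. The IDE's exact rule isn't spelled out in this thread, so the check below is an assumption: it accepts only three dot-separated numbers. The name `is_plain_version` is hypothetical and is not part of any Arduino tooling.

```python
import re

# Assumed rule (not taken from the IDE source): exactly MAJOR.MINOR.PATCH,
# digits only, with no suffix such as "r2".
PLAIN_VERSION = re.compile(r"\d+\.\d+\.\d+")


def is_plain_version(version):
    """Return True if version is three dot-separated runs of digits."""
    return PLAIN_VERSION.fullmatch(version) is not None


if __name__ == "__main__":
    for candidate in ("1.0.2", "1.0.3r2", "1.0.4"):
        print(candidate, is_plain_version(candidate))
```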

PeterF

  • Hero Member
  • *****
  • Posts: 877
Re: Firmware 1.0.2 kills my oaks Wifi and sketches
« Reply #37 on: July 03, 2016, 01:40:52 am »
I'll be testing it out shortly ;)

The Arduino IDE has some interesting quirks of late, the board manager and library manager being sometimes more of a headache and hindrance than a benefit. Once you work out WTH it's doing though, everything is usually ok. Another more recent 'improvement' that I'm still not happy with is the library matching rules they use, although I can understand the reasoning behind why they work how they do... but that has made it so I have a very peculiar setup with separate sketchbook directories and icons to load those sketchbooks, so I can ensure that Oak only sees Oak stuff, STM32 only sees STM32 stuff and Arduino only sees Arduino... annoying, but surprisingly effective once you separate everything!

EDIT: I can break an oak on command!  ;D  :o

I have 1.0.4 on my laptop, and still have 1.0.1 on my desktop. I load some code onto my Oak via the desktop-1.0.1, and it works great. Load it on via the laptop-1.0.4, and it goes into the dreaded offline/online loop! Cool! 8) Put it into safe mode, and program it again with desktop-1.0.1, and no problem!
« Last Edit: July 03, 2016, 02:51:50 am by PeterF »

kh

  • Jr. Member
  • **
  • Posts: 64
  • OakTerm developer
Re: Firmware 1.0.2 kills my oaks Wifi and sketches
« Reply #38 on: July 03, 2016, 04:22:45 am »
Hi PeterF. That's not good! We'd better try and get to the bottom of this. Needless to say, I didn't run into this issue in my testing.

Is it code specific - will the blink sketch trigger it?

Do you have any experience with "git bisect"? If so, it would be great if you could use it to find the offending commit between 1.0.1 and 1.0.4.

PeterF

  • Hero Member
  • *****
  • Posts: 877
Re: Firmware 1.0.2 kills my oaks Wifi and sketches
« Reply #39 on: July 03, 2016, 05:38:33 am »
I haven't played with git too much, so no, not familiar with bisect, but I am happy to give it a try, time permitting.

It could be code specific: the code I used to test was an MQTT client, so it has some rather specific code / library imports. However, it was the exact same sketch in both cases, so it still points to some change. I'll try the blink sketch in the morning to rule that out. I should point out that the Oak in question was oakrestored to 1.0.1, and hasn't been updated since, so it is only getting updates via the Particle cloud, although that shouldn't make any difference? I'll pull my other sacrificial Oak out tomorrow also and rinse 'n repeat for consistency's sake...


kh

  • Jr. Member
  • **
  • Posts: 64
  • OakTerm developer
Re: Firmware 1.0.2 kills my oaks Wifi and sketches
« Reply #40 on: July 03, 2016, 12:12:50 pm »
Thanks PeterF.

There's a tutorial on "git bisect" here. Basically, it will do a binary search / divide-and-conquer through each code commit between the known good and bad versions, asking you to test each one.

I'm traveling at the moment, but here are some quick instructions that will hopefully get you started. I'm typing these from memory, so apologies if I get anything wrong. You might need to do some experimenting and googling to fill in the blanks:
  • Install 1.0.4 with Boards Manager
  • Remove the "Arduino15\packages\digistump\hardware\oak\1.0.4" directory
  • Open a command prompt and change to the "Arduino15\packages\digistump\hardware\oak" directory
  • In the command prompt, run "git clone https://github.com/digistump/OakCore.git 1.0.4" (this downloads OakCore into a directory named 1.0.4), then run "cd 1.0.4" so the following git commands run inside the clone
  • Run "git bisect start" (Start a bisect)
  • Run "git bisect bad" (Tell bisect that the current (latest) version is bad)
  • Run "git bisect good 1.0.1" (Tell bisect that 1.0.1 was the last known good version. This automatically updates the current state of the OakCore git repo to half way between the good and bad versions.)
  • Compile, upload and test a sketch
  • Tell bisect whether it was good or bad with "git bisect good" or "git bisect bad". This then updates the current state of the repo to half way between the new good and bad versions.
  • Repeat steps 8-9 above until bisect tells you it has found the first bad commit.
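
To show why the steps above converge quickly, here is a minimal Python sketch of the binary search that bisect performs. It is a simplification: it assumes a straight-line history (real git history can branch), and the commit names and the `is_bad` test are made up for the demo. In practice, "testing" a commit means compiling, uploading and watching the Oak.

```python
def first_bad_commit(commits, is_bad):
    """Find the first bad commit by binary search.

    commits is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. Returns (first_bad, commits_tested).
    """
    good, bad = 0, len(commits) - 1
    tested = []
    while bad - good > 1:
        mid = (good + bad) // 2  # halfway between last good and first known bad
        tested.append(commits[mid])
        if is_bad(commits[mid]):
            bad = mid   # like answering "git bisect bad"
        else:
            good = mid  # like answering "git bisect good"
    return commits[bad], tested


if __name__ == "__main__":
    # Ten hypothetical commits, c0 (good) .. c9 (bad); the breakage starts at c6.
    history = [f"c{i}" for i in range(10)]
    culprit, tested = first_bad_commit(history, lambda c: int(c[1:]) >= 6)
    print(culprit, tested)
```

With ten commits, only three need to be tested by hand, which is why bisecting is worth the setup even when each test takes several minutes.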

I think it's at least a possibility that this is MQTT specific. The code additions in 1.0.2 add more Particle subscriptions for the OakTerm communications events, and perhaps these, combined with the additional MQTT traffic, are enough to clog up the TCP stack so that the Particle Cloud comms reset. The good news is that if your sketch will reliably trigger this issue for everyone, it should be pretty easy to diagnose and fix.

PeterF

  • Hero Member
  • *****
  • Posts: 877
Re: Firmware 1.0.2 kills my oaks Wifi and sketches
« Reply #41 on: July 04, 2016, 01:19:52 am »
Wow... thanks for that! Thought I'd have to resort to google searches just to get some idea of WTH I'm supposed to do!  ;D

Hm... was about to report that my test subject didn't like a 1.0.4 compile of a blink sketch (just the plain ol' Arduino blink, but pin 1 instead of 13), but as I was writing this it changed its mind. Thankfully I had OakTerm still open, and caught an interesting message. I'll paste the log below. btw, the only reason I'm programming from safe mode is for consistency... because of the offline/online cycles, 1.0.4 doesn't like to program in 'normal mode' for me... but 1.0.1 is fine.

Code: [Select]
# I had the Oak in config mode, but started OakTerm after starting the Oak. Programming a 1.0.1 blink sketch
[18:08:45] Event: spark/flash/status - started
[18:09:05] Event: spark/status - offline
[18:09:05] Event: spark/flash/status - failed
[18:09:11] Event: spark/status - online
[18:10:27] Event: spark/status - offline

#Oak is happily blinking (user programmed, not triple). I may have switched the power on and off getting the second online message, just to check it was behaving.

# Loading on a 1.0.4 version of the blink sketch. As you can see it didn't go well after programming.
[18:10:41] Config Mode
[18:10:43] Event: spark/flash/status - started
[18:11:06] Event: spark/status - offline
[18:11:06] Event: spark/flash/status - failed
[18:11:12] Event: spark/status - online
[18:11:22] Event: spark/status - offline
[18:11:23] Event: spark/status - online
[18:11:33] Event: spark/status - offline
[18:11:35] Event: spark/status - online

# That's enough of this... power cycle

[18:11:42] Event: spark/status - online
[18:11:52] Event: spark/status - offline
[18:11:54] Event: spark/status - online
[18:12:04] Event: spark/status - offline
[18:12:06] Event: spark/status - online
[18:12:15] Event: spark/status - offline
[18:12:17] Event: spark/status - online
[18:12:26] Event: spark/status - offline
[18:12:27] Event: spark/status - online
[18:12:29] Event: spark/status/safe-mode - {"f":[],"v":{},"p":82,"m":[{"s":1040368,"l":"m","vc":30,"vv":30,"f":"s","n":"1","v":6,"d":[]},{"s":1040368,"l":"m","vc":30,"vv":30,"u":"0","f":"u","n":"1","v":1,"d":[{"f":"s","n":"1","v":9,"_":""}]}]}
[18:12:28] Event: spark/status/safe-mode - {"f":[],"v":{},"p":82,"m":[{"s":1040368,"l":"m","vc":30,"vv":30,"f":"s","n":"1","v":6,"d":[]},{"s":1040368,"l":"m","vc":30,"vv":30,"u":"0","f":"u","n":"1","v":1,"d":[{"f":"s","n":"1","v":9,"_":""}]}]}

#hm... just noticed it was blinking (user programmed, not triple) as I was writing the forum reply... WTH? Lets see what another power cycle does

[18:13:41] Event: spark/status - online
[18:13:51] Event: spark/status - offline
[18:13:52] Event: spark/status - online
[18:14:02] Event: spark/status - offline
[18:14:03] Event: spark/status - online
[18:14:18] Event: spark/status/safe-mode - {"f":[],"v":{},"p":82,"m":[{"s":1040368,"l":"m","vc":30,"vv":30,"f":"s","n":"1","v":6,"d":[]},{"s":1040368,"l":"m","vc":30,"vv":30,"u":"0","f":"u","n":"1","v":1,"d":[{"f":"s","n":"1","v":9,"_":""}]}]}
[18:14:04] Event: spark/status/safe-mode - {"f":[],"v":{},"p":82,"m":[{"s":1040368,"l":"m","vc":30,"vv":30,"f":"s","n":"1","v":6,"d":[]},{"s":1040368,"l":"m","vc":30,"vv":30,"u":"0","f":"u","n":"1","v":1,"d":[{"f":"s","n":"1","v":9,"_":""}]}]}

#this is just nuts... is this indicating the error is on particles end? Regardless, the oak is happily blinking again  (user programmed, not triple)

EDIT1: I just programmed the MQTT sketch again with 1.0.4, and left it for longer, and it still kept endlessly offline and online-ing for a good four minutes, and then kicked out a spark/status/safe-mode and then started running the sketch.

Code: [Select]
[18:23:15] Event: spark/flash/status - started
[18:23:35] Event: spark/status - offline
[18:23:35] Event: spark/flash/status - failed
[18:23:41] Event: spark/status - online
[18:23:51] Event: spark/status - offline
[18:23:52] Event: spark/status - online
[18:24:02] Event: spark/status - offline
[18:24:04] Event: spark/status - online
[18:24:15] Event: spark/status - offline
[18:24:17] Event: spark/status - online
[18:24:27] Event: spark/status - offline
[18:24:28] Event: spark/status - online
[18:24:38] Event: spark/status - offline
[18:24:39] Event: spark/status - online
[18:24:49] Event: spark/status - offline
[18:24:50] Event: spark/status - online
[18:25:00] Event: spark/status - offline
[18:25:02] Event: spark/status - online
[18:25:12] Event: spark/status - offline
[18:25:14] Event: spark/status - online
[18:25:22] Event: spark/status - offline
[18:25:24] Event: spark/status - online
[18:25:34] Event: spark/status - offline
[18:25:35] Event: spark/status - online
[18:25:45] Event: spark/status - offline
[18:25:46] Event: spark/status - online
[18:25:54] Event: spark/status - offline
[18:25:56] Event: spark/status - online
[18:26:05] Event: spark/status - offline
[18:26:07] Event: spark/status - online
[18:26:16] Event: spark/status - offline
[18:26:18] Event: spark/status - online
[18:26:28] Event: spark/status - offline
[18:26:29] Event: spark/status - online
[18:26:38] Event: spark/status - offline
[18:26:41] Event: spark/status - online
[18:26:51] Event: spark/status - offline
[18:26:52] Event: spark/status - online
[18:27:02] Event: spark/status - offline
[18:27:03] Event: spark/status - online
[18:27:12] Event: spark/status - offline
[18:27:14] Event: spark/status - online
[18:27:26] Event: spark/status - offline
[18:27:27] Event: spark/status - online
[18:27:32] Event: spark/status/safe-mode - {"f":[],"v":{},"p":82,"m":[{"s":1040368,"l":"m","vc":30,"vv":30,"f":"s","n":"1","v":6,"d":[]},{"s":1040368,"l":"m","vc":30,"vv":30,"u":"0","f":"u","n":"1","v":1,"d":[{"f":"s","n":"1","v":9,"_":""}]}]}
[18:27:29] Event: spark/status/safe-mode - {"f":[],"v":{},"p":82,"m":[{"s":1040368,"l":"m","vc":30,"vv":30,"f":"s","n":"1","v":6,"d":[]},{"s":1040368,"l":"m","vc":30,"vv":30,"u":"0","f":"u","n":"1","v":1,"d":[{"f":"s","n":"1","v":9,"_":""}]}]}
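
To put numbers on the cycling in logs like the one above, here is a minimal Python sketch that measures how long each "online" period lasts before the next "offline" event. It assumes the OakTerm line format seen in these logs ("[HH:MM:SS] Event: spark/status - online"), ignores every other event type, and does not handle a log that runs past midnight. The function name `online_durations` is made up for illustration.

```python
import re
from datetime import datetime

# Matches only plain spark/status online/offline lines; flash and safe-mode
# events are deliberately skipped.
STATUS_LINE = re.compile(
    r"\[(\d{2}:\d{2}:\d{2})\] Event: spark/status - (online|offline)$"
)


def online_durations(log_text):
    """Return the seconds each online period lasted before going offline."""
    durations = []
    online_since = None
    for line in log_text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        if match.group(2) == "online":
            online_since = stamp
        elif online_since is not None:
            durations.append(int((stamp - online_since).total_seconds()))
            online_since = None
    return durations


if __name__ == "__main__":
    sample = """\
[18:23:35] Event: spark/status - offline
[18:23:35] Event: spark/flash/status - failed
[18:23:41] Event: spark/status - online
[18:23:51] Event: spark/status - offline
[18:23:52] Event: spark/status - online
[18:24:02] Event: spark/status - offline
[18:24:04] Event: spark/status - online
[18:24:15] Event: spark/status - offline
"""
    print(online_durations(sample))
```

On the excerpt in the demo, the online periods come out at roughly ten seconds each, which is consistent with some fixed timeout rather than random drops, though the log alone can't confirm that.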

EDIT2: And just to show it isn't just a temperamental Oak, I have subjected another one to the same abuse and it didn't like it either! I noticed this time (since I was staring at it waiting for it to behave) that it would periodically do the programmed blink, but the timing was way off (i.e. one second on, 10 seconds off). I assume (I know, not a good idea!) that the other one was doing the same but I just didn't see it. I got fed up with this one after nearly 10 minutes of cycling and power cycled it, with no change after two minutes: still cycling.

Code: [Select]

#lets try the blink on a different oak, with 1.0.1

[18:33:24] Device change: Oak4
[18:33:19] Event: spark/flash/status - started
[18:33:47] Event: spark/status - offline
[18:33:47] Event: spark/flash/status - failed
[18:33:58] Event: spark/status - online
[18:34:54] Event: spark/status - online

#user programmed blink, now lets try 1.0.4...

[18:35:06] Config Mode
[18:36:40] Event: spark/flash/status - started
[18:37:02] Event: spark/status - offline
[18:37:02] Event: spark/flash/status - failed
[18:37:08] Event: spark/status - online
[18:37:17] Event: spark/status - offline
[18:37:22] Event: spark/status - online
[18:37:32] Event: spark/status - offline
[18:37:33] Event: spark/status - online
[18:37:42] Event: spark/status - offline
[18:37:44] Event: spark/status - online
[18:37:53] Event: spark/status - offline
[18:37:55] Event: spark/status - online
[18:38:03] Event: spark/status - offline
[18:38:05] Event: spark/status - online
[18:38:14] Event: spark/status - offline
[18:38:18] Event: spark/status - online
[18:38:27] Event: spark/status - offline
[18:38:31] Event: spark/status - online
[18:38:40] Event: spark/status - offline
[18:38:44] Event: spark/status - online
[18:38:53] Event: spark/status - offline
[18:38:57] Event: spark/status - online
[18:39:06] Event: spark/status - offline
[18:39:10] Event: spark/status - online
[18:39:19] Event: spark/status - offline
[18:39:22] Event: spark/status - online
[18:39:32] Event: spark/status - offline
[18:39:35] Event: spark/status - online
[18:39:45] Event: spark/status - offline
[18:39:49] Event: spark/status - online
[18:39:59] Event: spark/status - offline
[18:40:03] Event: spark/status - online
[18:40:13] Event: spark/status - offline
[18:40:16] Event: spark/status - online
[18:40:26] Event: spark/status - offline
[18:40:29] Event: spark/status - online
[18:40:39] Event: spark/status - offline
[18:40:42] Event: spark/status - online
[18:40:52] Event: spark/status - offline
[18:40:55] Event: spark/status - online
[18:41:05] Event: spark/status - offline
[18:41:08] Event: spark/status - online
[18:41:18] Event: spark/status - offline
[18:41:21] Event: spark/status - online
[18:41:32] Event: spark/status - offline
[18:41:35] Event: spark/status - online
[18:41:45] Event: spark/status - offline
[18:41:48] Event: spark/status - online
[18:41:57] Event: spark/status - offline
[18:42:00] Event: spark/status - online
[18:42:11] Event: spark/status - offline
[18:42:14] Event: spark/status - online
[18:42:24] Event: spark/status - offline
[18:42:27] Event: spark/status - online
[18:42:37] Event: spark/status - offline
[18:42:40] Event: spark/status - online
[18:42:49] Event: spark/status - offline
[18:42:52] Event: spark/status - online
[18:43:01] Event: spark/status - offline
[18:43:04] Event: spark/status - online
[18:43:14] Event: spark/status - offline
[18:43:17] Event: spark/status - online
[18:43:27] Event: spark/status - offline
[18:43:30] Event: spark/status - online
[18:43:39] Event: spark/status - offline
[18:43:43] Event: spark/status - online

#that's enough of this... what about a power cycle?

[18:43:54] Event: spark/status - online
[18:44:04] Event: spark/status - offline
[18:44:06] Event: spark/status - online
[18:44:16] Event: spark/status - offline
[18:44:18] Event: spark/status - online
[18:44:30] Event: spark/status - online
[18:44:38] Event: spark/status - offline
[18:44:40] Event: spark/status - online
[18:44:49] Event: spark/status - offline
[18:44:51] Event: spark/status - online
[18:45:00] Event: spark/status - offline
[18:45:03] Event: spark/status - online
[18:45:13] Event: spark/status - offline
[18:45:16] Event: spark/status - online
[18:45:26] Event: spark/status - offline
[18:45:29] Event: spark/status - online
[18:45:40] Event: spark/status - offline
[18:45:43] Event: spark/status - online
[18:45:52] Event: spark/status - offline
[18:45:56] Event: spark/status - online
[18:46:04] Event: spark/status - offline

Next up, bisections... but that will probably have to be a job for tomorrow. I'll shut up for now!  :o  ;D

EDIT3: I know I said I'd shut up... but this is important... one of the Oaks started offlining and onlining even in safe mode, and wouldn't co-operate... so I oak-restored it, and it made no difference at all once it had done the update and been configured: it went config mode, no user rom found, and back to its old tricks. I did a serial upload, and it kept doing it, and then finally triggered a 'spark/status/safe-mode' event again before starting the user sketch properly. I'm chalking this up to a fault on the Particle servers for now, as this is on the 1.0.1 config that was working perfectly fine until just now.
« Last Edit: July 04, 2016, 02:45:28 am by PeterF »

kh

  • Jr. Member
  • **
  • Posts: 64
  • OakTerm developer
Re: Firmware 1.0.2 kills my oaks Wifi and sketches
« Reply #42 on: July 04, 2016, 02:25:47 pm »
PeterF's experience is probably dissuading other people from trying 1.0.4, but if you have tried it, I'd be very interested to know if you have experienced the same issues he has.

kh

  • Jr. Member
  • **
  • Posts: 64
  • OakTerm developer
Re: Firmware 1.0.2 kills my oaks Wifi and sketches
« Reply #43 on: July 04, 2016, 02:58:19 pm »
PeterF, that is very odd.

As far as I can tell, the spark/status/safe-mode event is being generated by the Cloud using data sent by the Oak in response to the DESCRIBE request in the Particle Cloud handshake. However, the weird thing is that this data appears to be sent every time any Oak connects to the Cloud, and this part of the code is the same in 1.0.4 as it is in 1.0.1, yet I've never seen this message before.

One other question. I'm not sure if you had a chance to check, but after the safe-mode event, once the sketch started running, were you able to upload code OTA?

PeterF

  • Hero Member
  • *****
  • Posts: 877
Re: Firmware 1.0.2 kills my oaks Wifi and sketches
« Reply #44 on: July 04, 2016, 09:53:46 pm »
I couldn't say earlier because, as I mentioned in EDIT3 above, at least one of the two test subjects started its endless online/offline loop after entering safe mode, and kept doing so even after a fresh firmware wipe over serial (using the firmware image downloaded last night from https://oakota.digistump.com/firmware/firmware_v1.bin), which is why I'm pointing the finger squarely at the Particle server ATM.

I just tried again with subject 1 (Oak5), and after triggering spark/status/safe-mode and running the user code properly, it was OTA programmable. I'm still waiting on the second test subject (Oak4) to become responsive... LATER: Had visitors, so it had time to trip. Oak4 also programmed once the user program was running (blink sketch). It is now doing the offline/online loop, so I expect to see it running normally soon.