Failure Modes
This page documents every known failure mode in the SmartPot system, the automatic responses to each, and what the operator should expect when things go wrong.
Design Philosophy
SmartPot operates in a harsh, remote environment where no failure can be addressed instantly. The system is designed around three priorities, in order:
- Fail safe. When in doubt, release catch. A jammed door defaults to unlocked. A severed tether causes the submerged unit to open the door on a timeout. No failure mode should result in marine life trapped indefinitely --- this is both an ethical and regulatory requirement.
- Preserve position data. A pot with no communication is still recoverable if you know where it is. Even in critical battery depletion, the buoy wakes periodically to broadcast a GPS beacon.
- Alert the operator. Every fault condition generates an ALERT (opcode 0x83) so the operator can make informed decisions about recovery priority. See Command Protocol for ALERT packet details.
The system does not attempt to be clever under failure. It degrades predictably, tells you what happened, and protects the catch and the gear.
Failure Reference
LoRa Communication Loss
The smart buoy cannot reach the base station. Causes include range exceedance, RF interference, antenna damage or corrosion, and atmospheric ducting effects.
| Aspect | Detail |
|---|---|
| Detection | Base station tracks per-pot “last seen” timestamp. Stale timestamp triggers operator alert after configurable threshold (default: 3 missed telemetry cycles = 45 minutes at standard 15-minute intervals). |
| Buoy behavior | Continues normal operation. Telemetry packets are written to a flash ring buffer (see Flash Storage Full for capacity limits). Catch classification and door control proceed uninterrupted. |
| Recovery | When LoRa link is restored, the buoy transmits buffered telemetry in chronological order, oldest first. Each buffered packet is tagged with its original timestamp so the base station can reconstruct the timeline. |
| Operator action | Base station UI displays “last seen” age and signal strength trend. If link is not restored within operator-defined threshold, physical retrieval is recommended. |
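The stale-timestamp check described above can be sketched as follows. This is a minimal illustration, not the base station's actual code: the function names `is_stale` and `stale_pots` and the dictionary of last-seen timestamps are assumptions, and only the 15-minute interval and 3-cycle default come from the table.

```python
from datetime import datetime, timedelta

TELEMETRY_INTERVAL = timedelta(minutes=15)  # standard interval from the table
MISSED_CYCLES_THRESHOLD = 3                 # documented default (45 minutes)


def is_stale(last_seen, now,
             interval=TELEMETRY_INTERVAL,
             missed_cycles=MISSED_CYCLES_THRESHOLD):
    """Return True once a pot has been silent for the configured number of cycles."""
    return now - last_seen >= interval * missed_cycles


def stale_pots(last_seen_by_pot, now):
    """Hypothetical helper: sorted IDs of pots whose last-seen timestamp is stale."""
    return sorted(pot_id for pot_id, seen in last_seen_by_pot.items()
                  if is_stale(seen, now))
```

Whether the alert fires at exactly 45 minutes or only after it is not specified in the source; the sketch treats the boundary as inclusive.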
Tether Failure
The wired connection between the submerged unit (inside the pot) and the smart buoy (at the surface) is severed or shorted. Causes include abrasion on the pot frame, marine growth, bite damage, or storm-induced mechanical stress.
| Aspect | Detail |
|---|---|
| Detection | Buoy detects loss of heartbeat signal from the submerged unit (expected every 5 seconds over the tether). |
| Buoy behavior | Sends ALERT opcode 0x83 with alert type TETHER_FAULT. Continues transmitting telemetry (GPS, battery, buoy-side sensors) but catch classification data goes stale --- last known counts are reported with a CAMERA_FAULT flag set. |
| Submerged unit behavior | Detects loss of tether power/comms. Enters autonomous fallback mode: door unlocks after a configurable timeout (default: 4 hours). This prevents trapping catch indefinitely in a unit that can no longer communicate. |
| Operator action | Tether faults generally require physical retrieval. The pot is still fishing but without classification or remote control. |
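The submerged unit's autonomous fallback can be modeled as a simple countdown. The sketch below is illustrative only: the class name `FallbackTimer`, the `tether_ok` input, and the choice to cancel the countdown when the tether recovers are assumptions not confirmed by the source. Only the 4-hour default comes from the table.

```python
UNLOCK_TIMEOUT_S = 4 * 60 * 60  # documented default: 4 hours


class FallbackTimer:
    """Countdown that unlocks the door after a sustained tether loss."""

    def __init__(self, timeout_s=UNLOCK_TIMEOUT_S):
        self.timeout_s = timeout_s
        self.lost_at_s = None       # time at which the tether loss was first seen
        self.door_unlocked = False  # latched once the timeout expires

    def update(self, now_s, tether_ok):
        """Feed the current time and tether state; return True if the door is unlocked."""
        if tether_ok:
            # Assumption: a restored tether cancels a countdown in progress.
            self.lost_at_s = None
            return self.door_unlocked
        if self.lost_at_s is None:
            self.lost_at_s = now_s
        if now_s - self.lost_at_s >= self.timeout_s:
            self.door_unlocked = True
        return self.door_unlocked
```

The unlocked state is latched here so that a flapping tether cannot re-lock the door on its own; whether the real firmware behaves this way is not stated in the source.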
Battery Depletion
Smart buoy battery voltage drops below operating thresholds. Causes include solar panel fouling (bird droppings, algae, salt crust), extended overcast weather, panel physical damage, or abnormally high duty cycle.
| Aspect | Detail |
|---|---|
| Detection | Battery voltage monitored continuously. Thresholds defined below. |
| 20% threshold (~3.5V) | ALERT opcode 0x83 with alert type LOW_BATTERY_WARNING. Telemetry interval doubles (30 minutes) to conserve power. LoRa TX power reduced to +14 dBm. |
| 10% threshold (~3.3V) | ALERT opcode 0x83 with alert type LOW_BATTERY_CRITICAL. Classification and tether power to submerged unit are disabled. Telemetry reduced to every 60 minutes. The LOW_BATTERY flag is set in the telemetry bitfield (see Telemetry Format). |
| 5% threshold (~3.1V) | System enters deep sleep. All subsystems powered down except the RTC and GNSS. Buoy wakes every 6 hours, acquires a GPS fix, transmits a single beacon packet (position + battery voltage), and returns to sleep. |
| Operator action | Retrieve and service the buoy. Check solar panel for fouling or damage. A healthy panel in normal conditions should maintain indefinite operation. |
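The progressive power-down in the table maps naturally onto a lookup from state of charge to operating mode. The following is a sketch under stated assumptions: the `PowerMode` type, the mode names, and the inclusive threshold comparison are hypothetical, while the percentages, intervals, and alert types are taken from the table. Hysteresis on recovery is not described in the source and is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PowerMode:
    name: str                               # hypothetical mode label
    telemetry_interval_min: Optional[int]   # None: beacon only, every 6 hours
    classification_enabled: bool
    alert_type: Optional[str]               # alert type sent with opcode 0x83


NORMAL = PowerMode("NORMAL", 15, True, None)
WARNING = PowerMode("WARNING", 30, True, "LOW_BATTERY_WARNING")
CRITICAL = PowerMode("CRITICAL", 60, False, "LOW_BATTERY_CRITICAL")
DEEP_SLEEP = PowerMode("DEEP_SLEEP", None, False, None)


def power_mode(charge_percent):
    """Select the operating mode for a given state of charge (0-100)."""
    # Assumption: thresholds are inclusive; the source only says "threshold".
    if charge_percent <= 5:
        return DEEP_SLEEP
    if charge_percent <= 10:
        return CRITICAL
    if charge_percent <= 20:
        return WARNING
    return NORMAL
```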
GPS Drift / Pot Displacement
The buoy’s GPS position exceeds a configurable distance threshold from the recorded deployment coordinates. Causes include tidal current, storm surge, anchor drag, or the pot being caught in a trawl.
| Aspect | Detail |
|---|---|
| Detection | Each GNSS fix is compared against deployment coordinates stored in NVS. Threshold default: 50 meters. |
| Buoy behavior | ALERT opcode 0x83 with alert type DRIFT_ALERT. Payload includes displacement distance (meters) and bearing (degrees) from deployment point. The DRIFT_ALERT flag is set in the telemetry bitfield. Telemetry interval shortens to 5 minutes for position tracking. |
| Operator action | Evaluate distance and bearing. Options: (1) send STATUS command to get current full telemetry, (2) update the expected deployment coordinates via SET_DEPLOY_POS if the pot settled in an acceptable location, or (3) schedule physical retrieval. |
| Sustained drift | If position changes continuously (pot being dragged), drift alerts repeat every 5 minutes with updated distance/bearing. This pattern is distinct from a one-time displacement and indicates the pot is in motion. |
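The distance and bearing carried in the DRIFT_ALERT payload can be computed from two GNSS fixes. The sketch below uses the haversine formula and the standard initial-bearing formula; the source does not say which method the firmware uses, so treat this as one reasonable choice. The function names are hypothetical, and only the 50-meter default comes from the table.

```python
import math

EARTH_RADIUS_M = 6_371_000   # mean Earth radius
DRIFT_THRESHOLD_M = 50.0     # documented default


def displacement(deploy_lat, deploy_lon, fix_lat, fix_lon):
    """Return (distance in meters, bearing in degrees) from deployment point to fix."""
    phi1, phi2 = math.radians(deploy_lat), math.radians(fix_lat)
    dphi = phi2 - phi1
    dlmb = math.radians(fix_lon - deploy_lon)

    # Haversine great-circle distance
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    distance_m = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

    # Initial bearing, normalized to 0-360 degrees clockwise from true north
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    bearing_deg = (math.degrees(math.atan2(y, x)) + 360) % 360
    return distance_m, bearing_deg


def is_drifting(deploy_lat, deploy_lon, fix_lat, fix_lon, threshold_m=DRIFT_THRESHOLD_M):
    """True when the fix lies farther from the deployment point than the threshold."""
    distance_m, _ = displacement(deploy_lat, deploy_lon, fix_lat, fix_lon)
    return distance_m > threshold_m
```

A production check would likely also filter out single noisy fixes before raising an alert; the source does not describe any such filtering, so none is shown.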
Door Mechanism Failure
The servo-driven door latch on the submerged unit jams, fails to respond, or draws abnormal current. Causes include sediment intrusion, corrosion, mechanical obstruction, or servo motor burnout.
| Aspect | Detail |
|---|---|
| Detection | Servo current monitoring on the submerged unit. Expected current draw for a full open/close cycle is 150-400 mA for 200-500 ms. A jam is detected when current exceeds 600 mA for more than 1 second, or when the expected position feedback is not reached. |
| Submerged unit behavior | Reports servo fault to buoy over tether. ALERT opcode 0x83 with alert type SERVO_FAULT. The SERVO_FAULT flag is set in telemetry. |
| Failsafe | Door defaults to unlocked after a configurable timeout (default: 2 hours from last successful actuation). This is a deliberate design choice --- an unlocked door means catch can escape, but it also means bycatch and undersized animals are not trapped indefinitely. The alternative (defaulting to locked on failure) risks regulatory non-compliance and marine mortality. |
| Operator action | Retrieve the pot for mechanical inspection. The pot may still be fishing passively (funnel entries still work) but without selective harvest capability. |
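The overcurrent half of the jam detection rule can be sketched as a scan over timestamped current samples. The function name `detect_jam` and the sample format are assumptions; the 600 mA and 1-second limits come from the table. The position-feedback check is not modeled because the source gives no detail on it.

```python
JAM_CURRENT_MA = 600    # documented jam threshold
JAM_DURATION_MS = 1000  # current must stay above the threshold for longer than this


def detect_jam(samples, jam_current_ma=JAM_CURRENT_MA, jam_duration_ms=JAM_DURATION_MS):
    """Return True if current stays above the threshold for more than the duration.

    samples: iterable of (timestamp_ms, current_ma) pairs in time order.
    """
    over_since = None  # timestamp of the first sample in the current overcurrent run
    for t_ms, current_ma in samples:
        if current_ma > jam_current_ma:
            if over_since is None:
                over_since = t_ms
            elif t_ms - over_since > jam_duration_ms:
                return True
        else:
            over_since = None  # run broken, start over
    return False
```

Requiring a continuous run means a brief inrush spike at the start of a normal 200-500 ms actuation does not register as a jam.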
Flash Storage Full
The smart buoy’s local telemetry ring buffer reaches capacity. This occurs when communication loss persists long enough for the buffer to fill.
| Aspect | Detail |
|---|---|
| Buffer capacity | 8MB flash allocation, ~4000 telemetry packets (at 24 bytes each, plus timestamp and metadata overhead). At 15-minute intervals, this represents approximately 40 days of buffered telemetry. |
| Detection | Buffer utilization tracked internally. Warning ALERT sent at 80% capacity. |
| Behavior | Ring buffer --- oldest entries are overwritten when the buffer is full. The buoy continues operating normally. No data loss for recent events; only the oldest history is discarded. |
| Operator action | Not urgent in isolation, but a full buffer means the buoy has been out of contact for over a month. The communication loss itself is the primary concern. |
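The overwrite-oldest behavior and the 80% warning can be illustrated with an in-memory ring buffer. This is a sketch only: the real buffer lives in flash, and the class name, method names, and packet tuple layout are assumptions. The capacity of about 4000 packets and the 80% warning level come from the table.

```python
from collections import deque


def coverage_days(capacity_packets, interval_min):
    """Days of telemetry the buffer holds at a fixed telemetry interval."""
    return capacity_packets * interval_min / (60 * 24)


class TelemetryRingBuffer:
    """Fixed-capacity buffer that silently drops the oldest packet when full."""

    def __init__(self, capacity=4000, warn_percent=80):
        self._packets = deque(maxlen=capacity)  # deque discards from the left when full
        self.capacity = capacity
        self.warn_percent = warn_percent
        self.warning_sent = False

    def __len__(self):
        return len(self._packets)

    def append(self, timestamp, payload):
        """Store a packet; return True if this append crosses the warning level."""
        self._packets.append((timestamp, payload))
        if (not self.warning_sent
                and len(self._packets) * 100 >= self.capacity * self.warn_percent):
            self.warning_sent = True
            return True
        return False

    def drain(self):
        """Yield buffered packets oldest first, emptying the buffer."""
        while self._packets:
            yield self._packets.popleft()
        self.warning_sent = False
```

With these figures, `coverage_days(4000, 15)` evaluates to roughly 41.7, which matches the "approximately 40 days" stated in the table.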
Command Delivery Failure
A command sent from the base station receives no ACK after the maximum retry count. See Acknowledgment Protocol for the full handshake specification.
| Aspect | Detail |
|---|---|
| Retry behavior | Base station retransmits the command up to 3 times, with 10-second intervals between attempts. Each retransmit uses the same sequence number (the buoy deduplicates via the replay protection counter --- see Replay Protection). |
| After 3 failures | Command marked as failed in the base station log. Operator receives an alert with the pot ID and failed command. |
| No automatic retry beyond 3 | The system does not continue retrying indefinitely. The operator decides whether to attempt the command again, try at closer range, or physically retrieve the pot. This prevents a failed command from consuming LoRa airtime and battery on both ends. |
| Critical commands | For SURFACE, LOCK_DOOR, and UNLOCK_DOOR, delivery failure is highlighted with elevated priority in the operator UI since these commands typically reflect time-sensitive operational decisions. |
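The bounded retry policy can be sketched as a short loop. The `send` and `wait_for_ack` callables are hypothetical stand-ins for the radio layer, and the sketch assumes "up to 3 times" means three transmissions in total, which the source does not state explicitly. The reuse of one sequence number across attempts and the 10-second interval come from the table.

```python
def deliver(send, wait_for_ack, seq, payload, max_attempts=3, retry_interval_s=10):
    """Transmit a command until it is acknowledged or the attempts run out.

    send(seq, payload): hypothetical callable that transmits the command.
    wait_for_ack(seq, timeout_s): hypothetical callable returning True on ACK.
    Returns the 1-based attempt number that succeeded, or None on failure.
    """
    for attempt in range(1, max_attempts + 1):
        # The same sequence number is reused so the buoy can deduplicate.
        send(seq, payload)
        if wait_for_ack(seq, retry_interval_s):
            return attempt
    # No automatic retry beyond this point; the caller logs and alerts.
    return None
```

Returning `None` rather than raising keeps the failure path explicit for the caller, which then marks the command as failed and notifies the operator.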
Failure Summary
| Failure | Detection | Automatic Response | Operator Alert |
|---|---|---|---|
| LoRa loss | Stale timestamp at base | Buffer telemetry, retransmit on restore | Last-seen age warning |
| Tether break | Lost heartbeat | Submerged unit unlocks door on timeout | TETHER_FAULT ALERT |
| Battery depletion | Voltage thresholds | Progressive power-down, GPS beacon at 5% | LOW_BATTERY ALERTs at 20% and 10% |
| GPS drift | Position vs. deployment coords | Increased telemetry rate, position tracking | DRIFT_ALERT with distance/bearing |
| Door jam | Servo current monitoring | Door defaults to unlocked on timeout | SERVO_FAULT ALERT |
| Flash full | Buffer utilization | Ring buffer overwrites oldest | Warning ALERT at 80% |
| Command failure | No ACK after 3 retries | Command marked failed, no further retry | Failed command alert |
Cascading Failures
Some failures compound. The most common cascading scenario:
Solar failure leads to battery depletion, which leads to tether power loss, which leads to autonomous door unlock. In this chain, the system degrades through well-defined stages. The operator receives alerts at each stage (low battery, then tether fault), giving time to prioritize retrieval. Even in the worst case (total power loss with no prior warning), the submerged unit’s autonomous timeout ensures the door unlocks and catch is released.
Communication loss masks other failures. If the LoRa link is down, alerts for tether faults, battery depletion, or door failures cannot reach the operator in real time. These alerts are buffered and delivered when communication is restored. The base station’s stale-timestamp warning serves as a catch-all: if you haven’t heard from a pot in a while, something may be wrong regardless of the specific cause.