Failure Modes

This page documents every known failure mode in the SmartPot system, the automatic responses to each, and what the operator should expect when things go wrong.

SmartPot operates in a harsh, remote environment where no failure can be addressed instantly. The system is designed around three priorities, in order:

  1. Fail safe. When in doubt, release catch. A jammed door defaults to unlocked. A severed tether causes the submerged unit to open the door on a timeout. No failure mode should leave marine life trapped indefinitely; this is both an ethical and a regulatory requirement.
  2. Preserve position data. A pot with no communication is still recoverable if you know where it is. Even in critical battery depletion, the buoy wakes periodically to broadcast a GPS beacon.
  3. Alert the operator. Every fault condition generates an ALERT (opcode 0x83) so the operator can make informed decisions about recovery priority. See Command Protocol for ALERT packet details.

The system does not attempt to be clever under failure. It degrades predictably, tells you what happened, and protects the catch and the gear.

The smart buoy cannot reach the base station. Causes include range exceedance, RF interference, antenna damage or corrosion, and atmospheric ducting effects.

| Aspect | Detail |
| --- | --- |
| Detection | Base station tracks per-pot “last seen” timestamp. Stale timestamp triggers operator alert after configurable threshold (default: 3 missed telemetry cycles = 45 minutes at standard 15-minute intervals). |
| Buoy behavior | Continues normal operation. Telemetry packets are written to a flash ring buffer (see Flash Storage Full for capacity limits). Catch classification and door control proceed uninterrupted. |
| Recovery | When LoRa link is restored, the buoy transmits buffered telemetry in chronological order, oldest first. Each buffered packet is tagged with its original timestamp so the base station can reconstruct the timeline. |
| Operator action | Base station UI displays “last seen” age and signal strength trend. If link is not restored within operator-defined threshold, physical retrieval is recommended. |
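
The stale-timestamp check and the oldest-first replay described above can be sketched as follows. This is a minimal illustration, not the base station's actual code: the function names are hypothetical, and treating exactly 45 minutes of silence as stale (rather than strictly more) is an assumption.

```python
from datetime import datetime, timedelta

TELEMETRY_INTERVAL = timedelta(minutes=15)  # standard interval from the table
MISSED_CYCLES_THRESHOLD = 3                 # default threshold from the table


def is_stale(last_seen, now,
             interval=TELEMETRY_INTERVAL,
             missed_cycles=MISSED_CYCLES_THRESHOLD):
    """Return True once a pot has been silent for `missed_cycles` intervals.

    Assumption: the boundary is inclusive (45 minutes exactly counts as stale).
    """
    return now - last_seen >= interval * missed_cycles


def replay_order(buffered):
    """Order buffered (original_timestamp, payload) packets oldest first."""
    return sorted(buffered, key=lambda packet: packet[0])


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 12, 0)
    print(is_stale(t0, t0 + timedelta(minutes=30)))  # still within threshold
    print(is_stale(t0, t0 + timedelta(minutes=45)))  # threshold reached
```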

The wired connection between the submerged unit (inside the pot) and the smart buoy (at the surface) is severed or shorted. Causes include abrasion on the pot frame, marine growth, bite damage, or storm-induced mechanical stress.

| Aspect | Detail |
| --- | --- |
| Detection | Buoy detects loss of heartbeat signal from the submerged unit (expected every 5 seconds over the tether). |
| Buoy behavior | Sends ALERT opcode 0x83 with alert type TETHER_FAULT. Continues transmitting telemetry (GPS, battery, buoy-side sensors), but catch classification data goes stale: last known counts are reported with a CAMERA_FAULT flag set. |
| Submerged unit behavior | Detects loss of tether power/comms. Enters autonomous fallback mode: door unlocks after a configurable timeout (default: 4 hours). This prevents trapping catch indefinitely in a unit that can no longer communicate. |
| Operator action | Tether faults generally require physical retrieval. The pot is still fishing but without classification or remote control. |
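
Both sides of the tether run an independent timer. The sketch below shows one way those two checks might look. The 5-second heartbeat period and 4-hour unlock timeout come from the table; the number of missed heartbeats the buoy tolerates before raising TETHER_FAULT is not specified in this page, so `MISSED_BEATS_FOR_FAULT` is a placeholder assumption, as are the function names.

```python
HEARTBEAT_PERIOD_S = 5        # expected heartbeat period (from the table)
UNLOCK_TIMEOUT_S = 4 * 3600   # default autonomous unlock timeout (from the table)
MISSED_BEATS_FOR_FAULT = 3    # ASSUMPTION: tolerance is not given in the source


def buoy_tether_fault(seconds_since_heartbeat,
                      missed_beats=MISSED_BEATS_FOR_FAULT):
    """Buoy side: True when enough heartbeats have been missed to alert."""
    return seconds_since_heartbeat > HEARTBEAT_PERIOD_S * missed_beats


def submerged_should_unlock(seconds_since_tether_loss,
                            timeout_s=UNLOCK_TIMEOUT_S):
    """Submerged side: True once the fallback timeout has elapsed."""
    return seconds_since_tether_loss >= timeout_s
```

Keeping the two checks separate reflects the point made in the table: the submerged unit must reach the unlock decision on its own, without any input from the buoy.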

Smart buoy battery voltage drops below operating thresholds. Causes include solar panel fouling (bird droppings, algae, salt crust), extended overcast weather, panel physical damage, or abnormally high duty cycle.

| Aspect | Detail |
| --- | --- |
| Detection | Battery voltage monitored continuously. Thresholds defined below. |
| 20% threshold (~3.5 V) | ALERT opcode 0x83 with alert type LOW_BATTERY_WARNING. Telemetry interval doubles (30 minutes) to conserve power. LoRa TX power reduced to +14 dBm. |
| 10% threshold (~3.3 V) | ALERT opcode 0x83 with alert type LOW_BATTERY_CRITICAL. Classification and tether power to submerged unit are disabled. Telemetry reduced to every 60 minutes. The LOW_BATTERY flag is set in the telemetry bitfield (see Telemetry Format). |
| 5% threshold (~3.1 V) | System enters deep sleep. All subsystems powered down except the RTC and GNSS. Buoy wakes every 6 hours, acquires a GPS fix, transmits a single beacon packet (position + battery voltage), and returns to sleep. |
| Operator action | Retrieve and service the buoy. Check solar panel for fouling or damage. A healthy panel in normal conditions should maintain indefinite operation. |
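
The staged power-down can be read as a lookup from battery percentage to operating mode. The sketch below encodes the thresholds and intervals from the table. The `PowerMode` type, the mode names, and the choice to treat each threshold as inclusive are illustrative assumptions, not firmware definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PowerMode:
    name: str                    # hypothetical label, not a firmware identifier
    telemetry_interval_min: int  # minutes between transmissions
    alert_type: Optional[str]    # ALERT type raised on entering this mode
    tether_power: bool           # whether the submerged unit stays powered


def power_mode(battery_pct: float) -> PowerMode:
    """Map battery percentage to a mode (thresholds assumed inclusive)."""
    if battery_pct <= 5:
        # Deep sleep: one GPS beacon every 6 hours.
        return PowerMode("deep_sleep", 6 * 60, None, False)
    if battery_pct <= 10:
        return PowerMode("critical", 60, "LOW_BATTERY_CRITICAL", False)
    if battery_pct <= 20:
        return PowerMode("warning", 30, "LOW_BATTERY_WARNING", True)
    return PowerMode("normal", 15, None, True)


if __name__ == "__main__":
    for pct in (80, 18, 9, 4):
        print(pct, power_mode(pct))
```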

The buoy’s GPS position exceeds a configurable distance threshold from the recorded deployment coordinates. Causes include tidal current, storm surge, anchor drag, or the pot being caught in a trawl.

| Aspect | Detail |
| --- | --- |
| Detection | Each GNSS fix is compared against deployment coordinates stored in NVS. Threshold default: 50 meters. |
| Buoy behavior | ALERT opcode 0x83 with alert type DRIFT_ALERT. Payload includes displacement distance (meters) and bearing (degrees) from deployment point. The DRIFT_ALERT flag is set in the telemetry bitfield. Telemetry interval shortens to 5 minutes (from the standard 15) for position tracking. |
| Operator action | Evaluate distance and bearing. Options: (1) send STATUS command to get current full telemetry, (2) update the expected deployment coordinates via SET_DEPLOY_POS if the pot settled in an acceptable location, or (3) schedule physical retrieval. |
| Sustained drift | If position changes continuously (pot being dragged), drift alerts repeat every 5 minutes with updated distance/bearing. This pattern is distinct from a one-time displacement and indicates the pot is in motion. |
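
The distance and bearing in the DRIFT_ALERT payload can be derived from two latitude/longitude pairs. This page does not say which formula the firmware uses; the sketch below assumes the haversine formula for distance and the standard initial-bearing formula, on a spherical Earth. Function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius (spherical approximation)
DRIFT_THRESHOLD_M = 50      # default threshold from the table


def displacement(deploy, fix):
    """Return (distance_m, bearing_deg) from `deploy` to `fix`.

    Both arguments are (latitude, longitude) in decimal degrees.
    Bearing is measured clockwise from true north, in [0, 360).
    """
    lat1, lon1 = map(math.radians, deploy)
    lat2, lon2 = map(math.radians, fix)
    dlat, dlon = lat2 - lat1, lon2 - lon1

    # Haversine great-circle distance.
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

    # Initial bearing from the deployment point toward the current fix.
    y = math.sin(dlon) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    bearing = (math.degrees(math.atan2(y, x)) + 360) % 360
    return distance, bearing


def is_drifting(deploy, fix, threshold_m=DRIFT_THRESHOLD_M):
    """True when the fix lies beyond the drift threshold."""
    return displacement(deploy, fix)[0] > threshold_m
```

At a 50-meter threshold the spherical approximation is more than adequate; GNSS fix noise is likely the larger source of error, which is one reason the threshold is configurable.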

The servo-driven door latch on the submerged unit jams, fails to respond, or draws abnormal current. Causes include sediment intrusion, corrosion, mechanical obstruction, or servo motor burnout.

| Aspect | Detail |
| --- | --- |
| Detection | Servo current monitoring on the submerged unit. Expected current draw for a full open/close cycle is 150 to 400 mA for 200 to 500 ms. A jam is detected when current exceeds 600 mA for more than 1 second, or when the expected position feedback is not reached. |
| Submerged unit behavior | Reports servo fault to buoy over tether. ALERT opcode 0x83 with alert type SERVO_FAULT. The SERVO_FAULT flag is set in telemetry. |
| Failsafe | Door defaults to unlocked after a configurable timeout (default: 2 hours from last successful actuation). This is a deliberate design choice: an unlocked door means catch can escape, but it also means bycatch and undersized animals are not trapped indefinitely. The alternative (defaulting to locked on failure) risks regulatory non-compliance and marine mortality. |
| Operator action | Retrieve the pot for mechanical inspection. The pot may still be fishing passively (funnel entries still work) but without selective harvest capability. |
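
The jam rule combines two conditions: sustained overcurrent, or failure to reach the target position. A minimal sketch of that rule over a series of timestamped current samples is shown below. The sampling format and function name are assumptions; the 600 mA and 1-second limits come from the table.

```python
JAM_CURRENT_MA = 600   # overcurrent limit from the table
JAM_DURATION_S = 1.0   # how long the overcurrent must persist


def detect_jam(samples, reached_target):
    """Return True if an actuation looks jammed.

    `samples` is an iterable of (time_s, current_mA) pairs in time order
    (a hypothetical format). `reached_target` is the position-feedback
    result at the end of the actuation.
    """
    over_since = None  # time at which the current first exceeded the limit
    for t, current_ma in samples:
        if current_ma > JAM_CURRENT_MA:
            if over_since is None:
                over_since = t
            if t - over_since > JAM_DURATION_S:
                return True  # sustained overcurrent
        else:
            over_since = None  # a dip below the limit resets the timer
    # No sustained overcurrent: a jam is still reported if the door
    # never reached its expected position.
    return not reached_target
```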

The smart buoy’s local telemetry ring buffer reaches capacity. This occurs when communication loss persists long enough for the buffer to fill.

| Aspect | Detail |
| --- | --- |
| Buffer capacity | 8 MB flash allocation, ~4000 telemetry packets (at 24 bytes each, plus timestamp and metadata overhead). At 15-minute intervals, this represents approximately 40 days of buffered telemetry. |
| Detection | Buffer utilization tracked internally. Warning ALERT sent at 80% capacity. |
| Behavior | Ring buffer: oldest entries are overwritten when the buffer is full. The buoy continues operating normally. No data loss for recent events; only the oldest history is discarded. |
| Operator action | Not urgent in isolation, but a full buffer means the buoy has been out of contact for over a month. The communication loss itself is the primary concern. |
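
The overwrite-oldest behavior and the 80% warning can be modeled with a bounded queue. The sketch below is an in-memory illustration using Python's `collections.deque`, not the flash layout the buoy actually uses; the class and method names are hypothetical. It also reproduces the retention arithmetic: 4000 packets at 15-minute intervals is about 41.7 days, consistent with the "approximately 40 days" figure above.

```python
from collections import deque


class TelemetryRing:
    """In-memory model of a ring buffer that overwrites its oldest entry."""

    def __init__(self, capacity=4000, warn_fraction=0.8):
        self._buf = deque(maxlen=capacity)  # deque drops the oldest when full
        self._warn_at = round(capacity * warn_fraction)
        self._warned = False

    def append(self, packet):
        """Store a packet; return True the first time utilization hits 80%."""
        self._buf.append(packet)
        if not self._warned and len(self._buf) >= self._warn_at:
            self._warned = True
            return True
        return False

    def drain(self):
        """Yield packets oldest first, emptying the buffer."""
        while self._buf:
            yield self._buf.popleft()

    def __len__(self):
        return len(self._buf)


def buffered_days(capacity=4000, interval_min=15):
    """Days of telemetry the buffer holds before overwriting begins."""
    return capacity * interval_min / (60 * 24)
```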

A command sent from the base station receives no ACK after the maximum retry count. See Acknowledgment Protocol for the full handshake specification.

| Aspect | Detail |
| --- | --- |
| Retry behavior | Base station retransmits the command up to 3 times, with 10-second intervals between attempts. Each retransmit uses the same sequence number (the buoy deduplicates via the replay protection counter; see Replay Protection). |
| After 3 failures | Command marked as failed in the base station log. Operator receives an alert with the pot ID and failed command. |
| No automatic retry beyond 3 | The system does not continue retrying indefinitely. The operator decides whether to attempt the command again, try at closer range, or physically retrieve the pot. This prevents a failed command from consuming LoRa airtime and battery on both ends. |
| Critical commands | For SURFACE, LOCK_DOOR, and UNLOCK_DOOR, delivery failure is highlighted with elevated priority in the operator UI since these commands typically reflect time-sensitive operational decisions. |

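
The bounded retry loop can be sketched as below. `send` and `wait_for_ack` are hypothetical callables standing in for the radio layer, and the sleep function is injectable so the loop can be exercised without real delays. One reading of the table is assumed here: an initial transmission followed by up to 3 retransmissions. If the intended meaning is 3 transmissions in total, set `max_retransmits=2`.

```python
import time

MAX_RETRANSMITS = 3    # from the table
RETRY_INTERVAL_S = 10  # from the table


def send_with_retry(send, wait_for_ack, seq, payload,
                    max_retransmits=MAX_RETRANSMITS,
                    interval_s=RETRY_INTERVAL_S,
                    sleep=time.sleep):
    """Send a command, retransmitting until ACKed or retries run out.

    Every attempt reuses the same sequence number so the receiver can
    deduplicate. Returns True on ACK, False if the command failed.
    """
    attempts = 1 + max_retransmits  # ASSUMPTION: initial send + retransmits
    for attempt in range(attempts):
        send(seq, payload)
        if wait_for_ack(seq):
            return True
        if attempt < attempts - 1:
            sleep(interval_s)  # pause before the next retransmission
    return False  # caller marks the command failed and alerts the operator
```

Returning a plain boolean leaves the decision about what happens next with the caller, which matches the stated policy that the operator, not the system, chooses whether to try again.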

| Failure | Detection | Automatic Response | Operator Alert |
| --- | --- | --- | --- |
| LoRa loss | Stale timestamp at base | Buffer telemetry, retransmit on restore | Last-seen age warning |
| Tether break | Lost heartbeat | Submerged unit unlocks door on timeout | TETHER_FAULT ALERT |
| Battery depletion | Voltage thresholds | Progressive power-down, GPS beacon at 5% | LOW_BATTERY ALERTs at 20% and 10% |
| GPS drift | Position vs. deployment coords | Increased telemetry rate, position tracking | DRIFT_ALERT with distance/bearing |
| Door jam | Servo current monitoring | Door defaults to unlocked on timeout | SERVO_FAULT ALERT |
| Flash full | Buffer utilization | Ring buffer overwrites oldest | Warning ALERT at 80% |
| Command failure | No ACK after 3 retries | Command marked failed, no further retry | Failed command alert |

Some failures compound. The most common cascading scenario:

Solar failure leads to battery depletion, which leads to tether power loss, which leads to autonomous door unlock. In this chain, the system degrades through well-defined stages. The operator receives alerts at each stage (low battery, then tether fault), giving time to prioritize retrieval. Even in the worst case (total power loss with no prior warning), the submerged unit’s autonomous timeout ensures the door unlocks and catch is released.

Communication loss masks other failures. If the LoRa link is down, alerts for tether faults, battery depletion, or door failures cannot reach the operator in real time. These alerts are buffered and delivered when communication is restored. The base station’s stale-timestamp warning serves as a catch-all: if you haven’t heard from a pot in a while, something may be wrong regardless of the specific cause.