What really happened to the software on the Mars Pathfinder spacecraft?

Mike
2013-07-04

It’s the 4th of July. Exactly sixteen years ago today, the Mars Pathfinder landed to a media fanfare and began to transmit data back to Earth. Days later, the flow of information and images was interrupted by a series of total system resets. How this problem was diagnosed and then resolved still makes for a fascinating tale for software engineers.[1]

Diagnosing the issue

The Pathfinder's applications were scheduled by the VxWorks RTOS. Since VxWorks provides pre-emptive priority scheduling of threads, tasks were executed as threads with priorities determined by their relative urgency.

The meteorological data gathering task ran as an infrequent, low priority thread and used the information bus, access to which was synchronized with mutual exclusion locks (mutexes). Other higher priority threads took precedence when necessary, including a very high priority bus management task, which also accessed the bus with mutexes. Unfortunately, a long-running communications task, with a priority higher than the meteorological task but lower than the bus management task, occasionally pre-empted the meteorological task while it held the mutex. With the meteorological task unable to run and release the mutex, the bus management task stayed blocked waiting for it.

Soon, a watchdog timer noticed that the bus management task had not been executed for some time, concluded that something had gone wrong, and ordered a total system reset. (Engineers later confessed that system resets had occurred during pre-flight tests. They put these down to a hardware glitch and returned to focusing on the mission-critical landing software.)
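
The watchdog behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not the Pathfinder implementation: the Watchdog class, the first_reset helper, the tick-based clock and the timeout of 3 ticks are all hypothetical, chosen only to show how a task that stops checking in eventually trips a reset.

```python
class Watchdog:
    """Minimal deadline watchdog: trips if it is not kicked often enough."""

    def __init__(self, timeout):
        self.timeout = timeout      # hypothetical limit, in ticks
        self.last_kick = 0          # tick of the most recent check-in

    def kick(self, now):
        """Called by the monitored task each time it completes a cycle."""
        self.last_kick = now

    def expired(self, now):
        """True once the monitored task has been silent for longer than the limit."""
        return now - self.last_kick > self.timeout


def first_reset(run_ticks, timeout, horizon):
    """Return the first tick at which the watchdog would order a reset, or None."""
    dog = Watchdog(timeout)
    for now in range(horizon):
        if now in run_ticks:        # the monitored task ran on this tick
            dog.kick(now)
        if dog.expired(now):
            return now
    return None


# Healthy schedule: the task runs every 2 ticks, so a limit of 3 is never exceeded.
print(first_reset({0, 2, 4, 6, 8}, timeout=3, horizon=10))   # None
# Starved schedule: the task stops running after tick 2.
print(first_reset({0, 2}, timeout=3, horizon=10))            # 6
```

The watchdog knows nothing about why the monitored task went quiet. It only sees the symptom, which is why the resets on their own gave engineers so little to go on.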

Finding a solution

Engineers worked frantically on a lab replica to diagnose and fix the problem, eventually spotting a priority inversion. A priority inversion occurs when a high priority task is indirectly pre-empted by a medium priority task, "inverting" the relative priorities of the two tasks (see Figure 1). This clearly violates the priority model, under which a high priority task can be prevented from running only by higher priority tasks and, briefly, by a low priority task that will quickly finish using a resource it shares with the high priority task.

Figure 1: Priority inversion

To fix the problem, they turned on a boolean parameter that indicates whether priority inheritance should be performed by the mutex. The mutex in question had been initialized with the parameter off; had it been on, the priority inversion would have been prevented.

Under priority inheritance, the task that holds the semaphore inherits the priority of a higher priority task when that task requests the semaphore. In Figure 1, task “low” would inherit the priority of task “high” when “high” requested the semaphore. This allows “low” to run ahead of “medium”, finish with the semaphore and release it to “high”.
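
To see the effect of the parameter, here is a small single-CPU simulation in Python, offered as a sketch rather than a model of VxWorks. The task names follow Figure 1, but the priorities, release times and workloads are invented for illustration, and the scheduler is simplified: each character of a task's plan is one tick, 'c' marks a tick that needs the mutex, and a task counts as blocked as soon as its next tick needs a mutex held by another task.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    priority: int                 # larger number = more urgent
    release: int                  # tick at which the task becomes ready
    plan: str                     # one char per tick: 'c' needs the mutex, 'x' does not
    done: int = 0                 # ticks of the plan executed so far
    finish: Optional[int] = None  # tick at which the task completed


def scenario():
    """Hypothetical workload shaped like Figure 1."""
    return [
        Task("low", 1, release=0, plan="ccc"),
        Task("high", 3, release=1, plan="cx"),
        Task("medium", 2, release=2, plan="xxxxx"),
    ]


def simulate(tasks, inherit, max_ticks=50):
    owner = None                  # task currently holding the mutex
    timeline = []
    for t in range(max_ticks):
        ready = [k for k in tasks if k.release <= t and k.finish is None]
        if not ready:
            if all(k.finish is not None for k in tasks):
                break
            timeline.append(".")  # idle tick
            continue

        def blocked(k):
            return k.plan[k.done] == "c" and owner is not None and owner is not k

        def effective(k):
            # With inheritance, the owner runs at the priority of its most urgent waiter.
            if inherit and k is owner:
                return max([k.priority] + [w.priority for w in ready if blocked(w)])
            return k.priority

        current = max((k for k in ready if not blocked(k)), key=effective)
        if current.plan[current.done] == "c":
            owner = current       # take (or keep) the mutex
        current.done += 1
        timeline.append(current.name[0].upper())
        at_end = current.done == len(current.plan)
        if owner is current and (at_end or current.plan[current.done] != "c"):
            owner = None          # critical section over: release the mutex
        if at_end:
            current.finish = t + 1
    return "".join(timeline), {k.name: k.finish for k in tasks}


for inherit in (False, True):
    timeline, finish = simulate(scenario(), inherit)
    print(f"inherit={inherit}: {timeline}  high finishes at tick {finish['high']}")
```

With inheritance off, the timeline is LLMMMMMLHH: “high” cannot finish until tick 10 because “medium” keeps “low” off the CPU while “low” still holds the mutex. With inheritance on, the timeline becomes LLLHHMMMMM and “high” finishes at tick 5, since “low” completes its critical section at the inherited priority before “medium” gets to run.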

The initialization parameter for the mutex which caused the problem (and those for two others which could have caused the same problem) was stored in a global variable, whose address was in symbol tables also included in the launch software. Because VxWorks contains a C language interpreter, intended to let developers type in C expressions and functions to be executed during system debugging, it was possible to upload a short C program to the spacecraft which, when interpreted, changed the values of these variables from FALSE to TRUE. This put an end to the system resets.

What did engineers learn?

  • only detailed traces of actual system behavior enabled the faulty execution sequence to be captured and identified – a black box diagnosis without traces would have been impossible;
  • the presence of "debugging" facilities in the system was extremely important – the problem could not have been corrected without the ability to modify the system;
  • spending extra time to ensure priority inheritance correctness at the testing stage, even at some additional performance cost, would have been invaluable.

The origins of the solution

When the keynote speaker referred to the paper that first identified the priority inversion problem and proposed the solution, something extraordinary happened: the authors were all in the room, and they received a rapturous reception. The original paper was:

L. Sha, R. Rajkumar, and J. P. Lehoczky. Priority Inheritance Protocols: An Approach to Real-Time Synchronization. In IEEE Transactions on Computers, vol. 39, pp. 1175-1185, Sep. 1990.


[1] This summary is based on a note written by Mike Jones in December 1997 following an IEEE Real-Time Systems Symposium keynote address by David Wilner, Chief Technical Officer of Wind River Systems.

All materials © Rapita Systems Ltd. 2023 - All rights reserved