Why do computers fail? Insights from FixFest 2019

October 22, 2019

Our second Open Repair Data dive was held in September at Mozilla Berlin as part of FixFest 2019.

About 20 repair data enthusiasts reviewed 606 records that sprang from recent repair events held by anstiftung and The Restart Project. With recent campaign successes around the right to repair domestic appliances, and with policy on computers next on the agenda, we wanted to look into the reasons for failures of computers that we see at our community repair events. 

Diving into Open Repair Data

At the FixFest data event, participants read through spreadsheets of Restart and anstiftung repair data for desktops, laptops and tablets brought to recent repair events. 

Each of the datasets contains a free-text field – problem – where volunteers at repair events have entered information about the nature of the issue as presented. Participants were asked to classify each record by selecting a predefined fault_type and fault_category from lists.
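As a rough illustration of the task, a classified record could be modelled as below. This is only a sketch: the field names problem, fault_type and fault_category come from the data, but the RepairRecord class, the classify helper, the shortened pick lists and the use of "Unknown" as a fault_category are assumptions made for the example, not the actual tooling used on the evening.

```python
from dataclasses import dataclass

# Assumed, shortened pick lists: only some of the fault types named in this post.
FAULT_TYPES = {
    "Power/battery",
    "Configuration",
    "Ports/slots/connectors",
    "Integrated screen",
    "Performance",
    "Unknown",
}
# "Unknown" as a category is an assumption for records with too little detail.
FAULT_CATEGORIES = {"Hardware", "Software", "Unknown"}


@dataclass
class RepairRecord:
    """One repair record: free text plus the two labels chosen by a reviewer."""
    problem: str
    fault_type: str = "Unknown"
    fault_category: str = "Unknown"


def classify(record, fault_type, fault_category):
    """Return a labelled copy of the record, rejecting values not in the pick lists."""
    if fault_type not in FAULT_TYPES:
        raise ValueError(f"fault_type not in pick list: {fault_type!r}")
    if fault_category not in FAULT_CATEGORIES:
        raise ValueError(f"fault_category not in pick list: {fault_category!r}")
    return RepairRecord(record.problem, fault_type, fault_category)
```

Restricting both labels to fixed lists is what makes the results countable later, at the cost of some nuance from the free-text field.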

 

Open Repair Data : Fault categorisation event at Mozilla Berlin, 2019

What we see in the raw data is a variety of opinions and writing styles reflecting the variety of volunteers that participate in repair events.

The content of the problem field can vary from nothing at all to a brief …

Not working

… to concise summaries such as …

Water damage. Used compressed air and isopropyl alcohol to dry/clean ram and gpu slots

… to short essays … 

Water Damage, the logic board was heavily corroded. I used a toothbrush and isopropanol and even dunked it in an isopropanol bath for a while to get rid of all the minerals.A proper inspection was not possible due to the lack of a microscope.

While all of this information is useful, it does not lend itself easily to summarisation, statistics, reporting or visualisation.

Defining a set of “fault types” is an attempt to consolidate the “problems” that are presented at repair events and produce a dataset that can be analysed for overall trends and stories.

What is a fault type?

A description of a “fault” varies depending on who is describing it. To the user of the device, the fault is that it fails their requirements in some respect. To an experienced repairer, the fault will be determined via parameters such as the device type, brand, model, age, maintenance etc.

The list of “fault types” we are using, seen in the list below, came about as a result of the first event on Open Data Day in London 2019. Data volunteers were asked to select from lists or enter their own “fault type”. Analysis of the thousands of results led to a consolidated list covering laptops, desktops and tablets in general. Some of the fault types suggested by the volunteers have been included, e.g. “Performance” and “Overheating”.

Open Repair Data : pick list of fault types

This is a working list, open to suggestions for amendments and additions.  Our work on the Open Repair Data Standard involves finding a common set of lists for things such as product categories, fault types, and repair outcomes – contact us if you’d like to get involved in discussions on these topics.

Results from the Open Repair Data dive

By the end of the evening we’d worked through all of the records amidst much chat about the descriptions of the problems, the list of fault types and the nature of the task. Also many excellent caffeinated drinks courtesy of Mozilla!

Common fault types

88% of the repairs we looked at were judged to have enough information to be given a classification, with the remainder being classed as “Unknown” – discussed below. 

Across all the categories of laptops, desktops and tablets, there was a broad split of fault types, with the top 5 known faults being power/battery, configuration, ports/slots/connectors, integrated screen and performance.

Open Repair Data : FixFest 2019 : Top 5 Fault Types

The most common fault types differ between specific product categories.  For laptops, the most common faults in the data were configuration, ports/slots/connectors and performance. The slice labelled “others” covers all other fault types combined, e.g. “Overheating”, “Internal damage”, “Virus/malware” etc.

For tablets, power/battery was the most commonly seen fault_type, followed by the integrated screen and then ports/slots/connectors.

Open Repair Data : FixFest 2019 : Top 5 Fault Products

Success rates

For the different devices that come to community repair events, we also report the top-level repair outcome – was it fixed during the event; repairable (for example, with professional help or needing a spare part sourced); or end-of-life.  The fault types with the highest fix rate were: configuration, performance and ports/slots/connectors.

The fault types with the lowest success rates during an event were power/battery, system board and integrated screen.  However, it is worth comparing this to repairable rates.  For example, power/battery has a good ratio of devices marked as repairable – it could be that a spare part is all that is required.  Or for a damaged system board, it could be that a trip to a local repair shop is suggested.

Open Repair Data : FixFest 2019 : Top 10 Fault Fixes

Hardware vs Software

In addition to fault type, we also classed faults as either a hardware problem or a software problem – the fault_category. There is no predefined correlation between fault_type and fault_category: some fault types can fall into either category, e.g. “Performance” could be due to either “Software” or “Hardware”, and often there simply isn’t enough information to go on.

In the recent data, the majority of faults were hardware faults. The success rate for hardware faults was 39.4%, whereas the success rate for software faults was 73%.
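For anyone wanting to reproduce this kind of figure from their own data, a success rate per fault_category can be computed along these lines. This is a minimal sketch: the fix_rates function, the outcome labels and the sample rows are all made up for illustration and are not the FixFest records.

```python
from collections import Counter


def fix_rates(records):
    """Return {fault_category: share of records whose outcome is 'Fixed'}."""
    totals = Counter()
    fixed = Counter()
    for fault_category, outcome in records:
        totals[fault_category] += 1
        if outcome == "Fixed":
            fixed[fault_category] += 1
    return {category: fixed[category] / totals[category] for category in totals}


# Invented sample rows: (fault_category, repair outcome).
sample = [
    ("Hardware", "Fixed"),
    ("Hardware", "Repairable"),
    ("Hardware", "End of life"),
    ("Hardware", "Fixed"),
    ("Software", "Fixed"),
    ("Software", "Fixed"),
    ("Software", "Repairable"),
    ("Software", "Fixed"),
]
rates = fix_rates(sample)  # {"Hardware": 0.5, "Software": 0.75}
```

The same function works for fault_type instead of fault_category if the first element of each pair is swapped accordingly.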

FixFest 2019 Fault Categories

Please note that these are only suggestive results from a small data sample and each represents the opinion of only one or two people.  They are the types of questions that can be explored further with more Open Repair Data and more people contributing to the classification and analysis. If you are part of an organisation that would like to share data, or would like to get involved with classification, analysis or visualisation of the data, please get in touch!

Lessons learned

For the fault types classified on the evening, 17.7% were classified as “Unknown”. Following review, this was reduced to 12.4%. Classification of a fault can be very subjective, and interpretation of the options varied from person to person. The fault_type values represent umbrella terms, and some people felt limited by the looser options, but we have to keep in mind that the list needs to be pretty concise – and we’ve not even begun to include fault types for products outside of the desktop, laptop and tablet categories yet.
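To put those percentages into record counts, here is a quick back-of-the-envelope calculation. It assumes that both percentages are taken over all 606 records reviewed, which the figures suggest but which is not stated outright; the approx_count helper is purely illustrative.

```python
TOTAL_RECORDS = 606  # records reviewed; assumed base for both percentages


def approx_count(share_percent, total=TOTAL_RECORDS):
    """Approximate number of records behind a published percentage."""
    return round(total * share_percent / 100)


unknown_on_the_night = approx_count(17.7)  # about 107 records
unknown_after_review = approx_count(12.4)  # about 75 records

# Share of records left with a known fault type after review.
classified_share = 100 * (TOTAL_RECORDS - unknown_after_review) / TOTAL_RECORDS
```

Under that assumption, roughly 107 records were “Unknown” on the night and roughly 75 after review, which leaves a classified share of about 88%.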

A couple of the most common debates centred on vague terms such as “slow” and on when to select the fault_type “Unknown”. For example:

The problem simply states …

Slow

… and someone selects “Unknown”. There is very little to go on as far as determining the exact nature of the fault, however “Performance” is the perceived issue even if the specifics or the solution are unknown.

Or the problem states …

User reports slow to load programmes. Fixed by upgrading RAM

… and a volunteer suggests the issue is “Not enough memory”, so they choose “Other”. There was apparently nothing wrong with the existing RAM or any other hardware. Instead, the functionality of the installed software was impaired due to its memory requirement. The solution in this case was to increase the amount of RAM, but an alternative solution might be to install different software. The solution does not necessarily dictate the type of fault. In this instance a case could be made for selecting either “Performance”, “Operating System” or “Configuration”. Debatable!

Open Repair Data : List of fault type descriptions

What next?

The aim of Open Repair Data fault classification is not necessarily to pinpoint each exact fault cause, but to group repairs into streams that can be reported and visualised. We’ll be working with our policy partners in Brussels to explore this data further.

We want to continue and improve fault classification to provide more and better analysis. To achieve this, we could enhance the list of fault types, provide clearer guidance for data entry and classification, and then get more eyes on each record to see if there is consensus.
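One simple way to check for consensus when several people have classified the same record is a majority vote with a minimum level of agreement. The sketch below is an assumption about how this could work, not a description of an existing process; the consensus function and the 60% threshold are invented for the example.

```python
from collections import Counter


def consensus(votes, threshold=0.6):
    """Return the most common label if enough reviewers agree, otherwise None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= threshold else None


# Two of three reviewers agree, so the record gets a consensus label.
agreed = consensus(["Performance", "Performance", "Unknown"])

# Three different opinions: no consensus, so the record would need discussion.
disputed = consensus(["Performance", "Operating System", "Configuration"])
```

Records that come back as None are the ones worth a second look, much like the “slow” examples debated on the evening.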

We are also investigating ways to improve data quality at the point of capture – always bearing in mind the busy nature of community repair events!  

Finally, using our whole dataset, we would like to explore how community repair data compares with fault types collected by independent repair businesses and in industry. 

Watch out for blog posts about these topics soon!
