Data Processing

What is data processing?

A simple definition of data processing is a sequence of operations that collects raw data and turns it into meaningful information. This sequence is usually carried out by a computer, which lets us quickly gain insight from large amounts of data.

Data processing steps

There are several steps (also called functions or tasks) that may be used to process data, depending on what raw data you have and what you need to know from it.

  • Validation: ensures data is relevant and correct.
  • Sorting: organizes data into a sequence and/or in particular sets.
  • Aggregation: brings multiple pieces of data together.
  • Summarization: reduces detailed data into key points.
  • Analysis: discovers, interprets, and communicates meaningful patterns.
  • Reporting: presents data.
  • Classification: divides data into different groups.
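
Several of these steps can be sketched in a few lines of Python. This is only an illustration, not a standard pipeline: the sample records, the function names, and the 100.0 threshold used for classification are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw data: purchase records, two of which are invalid.
raw_records = [
    {"customer": "alice", "amount": 120.0},
    {"customer": "bob", "amount": -5.0},   # invalid: negative amount
    {"customer": "alice", "amount": 30.0},
    {"customer": "carol", "amount": 75.5},
    {"customer": "", "amount": 10.0},      # invalid: missing customer
]

def validate(records):
    """Validation: keep records with a customer name and a non-negative amount."""
    return [r for r in records if r["customer"] and r["amount"] >= 0]

def sort_records(records):
    """Sorting: order records from smallest to largest amount."""
    return sorted(records, key=lambda r: r["amount"])

def aggregate(records):
    """Aggregation: bring each customer's amounts together into one total."""
    totals = defaultdict(float)
    for r in records:
        totals[r["customer"]] += r["amount"]
    return dict(totals)

def summarize(totals):
    """Summarization: reduce the per-customer totals to a few key figures."""
    values = list(totals.values())
    return {"customers": len(values), "total": sum(values), "average": mean(values)}

def classify(totals, threshold=100.0):
    """Classification: divide customers into 'high' and 'low' spenders."""
    return {name: "high" if total >= threshold else "low"
            for name, total in totals.items()}

valid = sort_records(validate(raw_records))
totals = aggregate(valid)
summary = summarize(totals)
labels = classify(totals)
```

Printing `summary` and `labels` at the end would correspond to the reporting step.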

Types of data processing

A computer can process data in different ways depending on the amount of data, time requirements, available computing power, and so on. Some of these types of data processing can become quite technical, but here’s a simple overview.

  • Batch processing: As the name implies, this type of data processing takes a chunk of data (in sequential order), processes it, and once it’s all complete, returns the insights for that chunk of data. This type helps reduce processing costs for large amounts of data.
  • Real-time or online processing: Probably the most familiar type, this receives and processes data at the same time, providing immediate results. It usually requires a continuous connection between the data source and the processing system, often over the internet.
  • Multiprocessing: This uses two or more independent computer brains (the technical term is central processing unit, or CPU) to process data simultaneously. The processing tasks are divided among whichever CPUs are currently available to reduce processing time and maximize throughput.
  • Time sharing: This type is where several users rely on a single CPU to process data. Users share the processing time and therefore have allocated time slots for processing data. This type is sometimes called a multi-access system.
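
The difference between batch and real-time processing can be shown with a small Python sketch. It assumes a hypothetical list of sensor readings and computes an average in both styles; the function names are illustrative only.

```python
from typing import Iterable, Iterator, List

def process_batch(readings: List[float]) -> float:
    """Batch: take the whole chunk, process it, and return one result at the end."""
    return sum(readings) / len(readings)

def process_stream(readings: Iterable[float]) -> Iterator[float]:
    """Real-time: emit an updated running average as each reading arrives."""
    total = 0.0
    for count, value in enumerate(readings, start=1):
        total += value
        yield total / count

readings = [10.0, 20.0, 30.0, 40.0]  # hypothetical sensor data

batch_result = process_batch(readings)            # one answer for the whole chunk
stream_results = list(process_stream(readings))   # one answer per incoming reading
```

Both approaches end at the same average, but the streaming version has a usable result after every reading, while the batch version produces nothing until the whole chunk has been processed.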

Examples of data processing

Examples of data processing can be seen in many common activities, some of which we may take for granted, such as paying with a credit card or taking a picture on a phone. For the latter, the camera lens captures the raw data (color, light, etc.) and the phone converts it into a photo file that can be easily edited, shared, or printed.

Transactions, whether paying by credit card or transferring money internationally, also require data processing to collect, verify, and format the payment credentials before the bank or other financial institution can accept the payment.

Another example is the automated billing for a Software-as-a-Service (SaaS) business. The computer summarizes the charges for each customer’s service plan and converts the charges to a monthly or annual invoice that is billed automatically.
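
As a rough sketch of that billing step, the Python below turns a subscription into an invoice. The plan names, prices, per-seat pricing, and the `Subscription` and `build_invoice` names are all hypothetical; a real billing system would also handle taxes, discounts, and proration.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical monthly price per seat for each service plan.
PLAN_PRICES = {"basic": Decimal("10.00"), "pro": Decimal("25.00")}

@dataclass
class Subscription:
    customer: str
    plan: str            # key into PLAN_PRICES
    seats: int
    billing_cycle: str   # "monthly" or "annual"

def build_invoice(sub: Subscription) -> dict:
    """Summarize the charges for one customer's plan into a single invoice."""
    months = 12 if sub.billing_cycle == "annual" else 1
    amount_due = PLAN_PRICES[sub.plan] * sub.seats * months
    return {
        "customer": sub.customer,
        "period": sub.billing_cycle,
        "amount_due": amount_due,
    }

invoice = build_invoice(Subscription("acme", "pro", seats=3, billing_cycle="annual"))
```

`Decimal` is used instead of floating-point numbers so that currency amounts are not affected by binary rounding.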

Self-driving vehicles are a somewhat less common example, but they illustrate a significant array of processed data. The sensors around the car provide tons of raw data about navigation, other vehicles and people, driving conditions, the color of traffic lights, street signage, and more. All of that data is then processed in real time to dictate when to go, stop, turn, change lanes, accelerate, signal, etc.

Additional resources to learn more about data processing