When to recycle vs. restructure your data

To recycle or restructure data?

Recycling means taking old data that was originally collected to solve one problem and using it to make inferences about newer issues, at least until the company obtains live data that either confirms or contradicts what the recycled data shows.

People sometimes refer to data recycling as data reuse, especially if multiple departments in a company use the data to achieve different aims. In contrast, data restructuring means manipulating existing data — such as that within a table or application — to make it more relevant for answering a question posed by a company or solving a problem.

Now, let’s look at some factors that should help you decide whether to recycle or restructure your data.

1. Consider recycling data when it may benefit others

If your data is the sort that could help others conduct research — and you have permission to share it based on data privacy regulations — recycling could be an excellent idea. Ohio State University has a data recycling project for information taken from surveys. It hopes to assist social scientists by providing them with free access to data that could help them overcome past challenges.

A lack of access to data often means that those researchers have trouble carrying out comparative analyses, especially if the data they have is from only one part of the world or not closely related enough for them to make accurate comparisons.

The goal of the project is to organise the original data and the recycled data, then combine them into a single integrated database. Many companies lack data that’s appropriate to share with others, and that may be the case for you too. If you do have suitable data, think about whether recycling it could benefit people in other areas of your company or outside it.

2. Think about restructuring data when your reporting process becomes too time-consuming

As companies continually collect data, the sheer volume can make reports take too long to produce. When you’re dealing with increased reporting requirements or the need for deeper analysis, restructuring the data could make it easier to extract meaningful insights and shorten the time required to create reports.

It may be useful to track which steps of creating a report take the longest, then evaluate whether an inappropriate data structure causes those slowdowns or whether something else is responsible.
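One way to track the slow steps is to time each stage of report creation separately. The sketch below is only an illustration under stated assumptions: the `timed` helper, the `build_report` function, and the three step names (load, aggregate, render) are hypothetical stand-ins for whatever your own reporting process does.

```python
import time
from contextlib import contextmanager

timings = {}  # step name -> accumulated elapsed seconds


@contextmanager
def timed(step):
    """Record how long the enclosed block takes under the given step name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[step] = timings.get(step, 0.0) + time.perf_counter() - start


def build_report(rows):
    """Hypothetical three-step report: load, aggregate, render."""
    with timed("load"):
        data = list(rows)
    with timed("aggregate"):
        totals = {}
        for region, amount in data:
            totals[region] = totals.get(region, 0) + amount
    with timed("render"):
        lines = [f"{region}: {total}" for region, total in sorted(totals.items())]
    return "\n".join(lines)


report = build_report([("north", 10), ("south", 5), ("north", 7)])
slowest = max(timings, key=timings.get)  # the step worth investigating first
```

If the slowest step turns out to be the aggregation or lookup stage rather than loading or rendering, that points toward the data structure as the likely cause.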

Even if your data allows for swift report creation in its current structure, you should keep in mind that the situation may change as your company investigates new ways to use and collect information. Research indicates that despite a growing investment in big data analytics at the corporate level, most companies don’t have data-driven cultures yet.

If that’s the case at your company, executives will undoubtedly want proof that their data-related investments are paying off. More in-depth reports could reveal the progress made, but not when the data’s structure prolongs the time needed to make those reports.

3. Data recycling may provide a more streamlined route to problem-solving

Companies may lean toward recycling their data when they feel confident in its validity and believe the information could support more than one conclusion. In that case, analysts may solve a problem faster, because the findings from one data set can help answer a different question more efficiently than starting without that historical information.

4. Go with restructuring to fit different analysis needs

It’s common for people working with data to realize they cannot carry out the desired analyses on data in its current form. Restructuring often makes the data more applicable. For example, the Restructure Data Wizard in IBM’s SPSS software can reshape data that has a complex structure.

The choices you make within the tool when altering the data depend on the data’s current structure. People can choose to restructure cases into variables or variables into cases, but sometimes only one of those options works for a particular kind of data analysis. For instance, if you want to analyze repeated measures, the data must first have a variable group structure.
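To make the two directions concrete, here is a minimal sketch of what "variables into cases" (wide to long) and "cases into variables" (long to wide) mean. It illustrates the concept only and is not how the wizard is implemented; the function names, the `subject` and `score_t1`/`score_t2` fields, and the `measure`/`value` column names are all hypothetical.

```python
def variables_to_cases(wide_rows, id_field, measure_fields):
    """Wide to long: one output row per (subject, measure) pair."""
    long_rows = []
    for row in wide_rows:
        for field in measure_fields:
            long_rows.append(
                {id_field: row[id_field], "measure": field, "value": row[field]}
            )
    return long_rows


def cases_to_variables(long_rows, id_field):
    """Long to wide: one output row per subject, one column per measure."""
    by_id = {}
    for row in long_rows:
        wide = by_id.setdefault(row[id_field], {id_field: row[id_field]})
        wide[row["measure"]] = row["value"]
    return list(by_id.values())


# Two subjects measured at two time points, stored as grouped variables (wide).
wide = [
    {"subject": 1, "score_t1": 12, "score_t2": 15},
    {"subject": 2, "score_t1": 9, "score_t2": 11},
]
long_rows = variables_to_cases(wide, "subject", ["score_t1", "score_t2"])
round_trip = cases_to_variables(long_rows, "subject")
```

In the wide form each repeated measure is its own variable; in the long form each measurement is its own case. Converting one way and back returns the original rows.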

It’s smart to think carefully about how you’ll use the data, as well as any issues with the data format preventing you from proceeding without restructuring. Then, deciding whether restructuring is worthwhile should become more straightforward.

5. Restructure data to use it with different programs

Companies have numerous choices when they’re picking the best programs to help them work with data. And today’s communication methods mean that people regularly work with data in one program, then send the data to someone else who may view it in a different one. In addition, when a company switches from one program to another, restructuring may be the first essential step in ensuring the data appears properly in the new one.

For example, you may have long-format Excel data and need to restructure it into the wide format used for SPSS. One option is to manually copy and paste the Excel data into SPSS, but that approach is error-prone and time-consuming because none of the work is automated.

Fortunately, IBM’s Restructure Data Wizard can handle the reformatting needed for the data to display correctly in SPSS. You must prepare the data in Excel first, then import it into SPSS. Before importing, make sure the variable names appear in the first Excel row and the data begins on the second row. Variable names cannot start with special characters or contain spaces, but replacing spaces with underscores lets SPSS recognize them correctly.
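The header preparation described above can also be scripted rather than done by hand. The following is a rough sketch, assuming the first row has already been read into a list of strings; `clean_variable_name` and `clean_header` are hypothetical helpers, and the rule of dropping every leading non-letter character is a deliberately conservative assumption, not a statement of SPSS’s exact naming rules.

```python
import re


def clean_variable_name(name):
    """Replace spaces with underscores and drop leading non-letter characters."""
    name = name.strip().replace(" ", "_")
    # Assumption: keep only names that start with a letter.
    name = re.sub(r"^[^A-Za-z]+", "", name)
    return name or "var"  # fall back to a placeholder if nothing is left


def clean_header(header):
    """Apply the cleaning rule to every name in the first spreadsheet row."""
    return [clean_variable_name(raw) for raw in header]


header = ["Customer ID", "$Revenue", " order date "]
cleaned = clean_header(header)
print(cleaned)  # ['Customer_ID', 'Revenue', 'order_date']
```

A real sheet may also need a check for duplicate names after cleaning, which this sketch leaves out.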

This example shows why it’s worth finding out whether restructuring could save steps and frustration when you handle data in more than one program.

Making an appropriate choice increases the value of data

Figuring out whether to recycle or restructure data is not always simple, but choosing the right option for your company’s needs could help you get more use from the data. The list above gives you an excellent starting point.

Kayla Matthews

Kayla Matthews is a tech journalist and writer.
