[Stata] Converting xlsx, csv, and sav format to Stata dta files

You may have encountered the need to import data from different sources into Stata. In this blog post, I am going to introduce how to import data from Excel (.xlsx), comma-separated values (.csv), and SPSS (.sav) files into Stata and save them in Stata's .dta format.

Import excel format (xlsx)

Excel files are one of the most common formats for storing and exchanging data. Stata can directly import data from Excel files with the extension .xls or .xlsx. There are two ways to do this: using the command import excel or using the menu option File > Import > Excel Spreadsheet.

The command import excel has the following syntax:

Stata
import excel using filename.xlsx, sheet("Sheet 1") firstrow case(lower)

where filename.xlsx is the Excel file to read, and the options after the comma control how the data are imported. Some of the options are as follows.

  • sheet("sheetname"): specify the name of the worksheet to import. If not specified, the first worksheet is imported by default.
  • cellrange(start:end): specify the range of cells to import. For example, cellrange(A1:G10) imports data from cells A1 to G10.
  • firstrow: treat the first row of data as variable names. If not specified, Stata names the variables after the Excel column letters (A, B, C, etc.).
  • case(preserve|lower|upper): preserve the case of variable names (the default), or convert them to lowercase or uppercase when using firstrow.
  • allstring("format"): import all data as strings, optionally specifying a numeric display format.
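
As a minimal sketch of how these options fit together, the lines below list the sheets in a workbook, import a cell range, and save the result as a .dta file. The file name survey.xlsx, the sheet name Sheet1, and the range A1:G10 are placeholders I made up for illustration, so replace them with your own.

```stata
* survey.xlsx, Sheet1, and A1:G10 are hypothetical; adjust to your workbook.

* List the worksheets and their cell ranges without importing anything.
import excel using "survey.xlsx", describe

* Import cells A1:G10 of Sheet1, using row 1 as lowercase variable names.
* clear discards whatever data are currently in memory.
import excel using "survey.xlsx", sheet("Sheet1") cellrange(A1:G10) ///
    firstrow case(lower) clear

* Write the imported data to a Stata .dta file.
save "survey.dta", replace
```

The save step is what actually produces the .dta file; import excel only loads the data into memory.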

Import csv format (csv)

CSV files are plain text files that store data in a tabular format, where each row represents an observation and each column represents a variable. The values are separated by commas or other delimiters. Stata can import data from CSV files with the command import delimited or the menu option File > Import > Text Data.

The command import delimited has the following syntax:

Stata
import delimited using filename.csv, varnames(1)

where filename.csv is the CSV file to read, and the options after the comma work much like those of import excel. Some of the options are:

  • delimiters("chars"): specify the delimiter used in the file. If not specified, Stata inspects the file and chooses between comma and tab. Other common delimiters are semicolon (";") and pipe ("|"); a tab can be written as "\t".
  • encoding("encoding"): specify the character encoding of the file, for example UTF-8, ISO-8859-1 (Latin-1), or windows-1252. Set it explicitly when accented or other non-ASCII characters come through garbled.
  • varnames(#|nonames): varnames(1) takes the variable names from row 1 of the file. varnames(nonames) says the file has no header row, in which case Stata generates default variable names such as v1, v2, etc.
  • stringcols(numlist|_all): specify the columns to import as strings. For example, stringcols(3 5 7) imports columns 3, 5, and 7 as strings, and stringcols(_all) imports every column as a string.
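
Here is a short sketch that combines these options and then writes a .dta file. I am assuming a semicolon-delimited, UTF-8 file called survey.csv whose first column is an ID that should stay a string (for example, to keep leading zeros); all of these are made-up details for illustration.

```stata
* survey.csv and its layout are hypothetical; adjust to your file.

* Read a semicolon-delimited UTF-8 file, take variable names from row 1,
* and keep column 1 as a string instead of converting it to a number.
import delimited using "survey.csv", delimiters(";") varnames(1) ///
    encoding("UTF-8") stringcols(1) clear

* Write the imported data to a Stata .dta file.
save "survey_csv.dta", replace
```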

Import spss format (sav)

SPSS files are binary files that store data and metadata in a proprietary format used by IBM SPSS Statistics software. Stata can import data from SPSS files with the extension .sav or .zsav (compressed) using the command import spss or the menu option File > Import > SPSS Data.

The command import spss has the following syntax:

Stata
import spss using filename, case(lower)

where filename is the SPSS file to read, and the options after the comma work much like those of import excel. Some of the options are as follows.

  • case(preserve|lower|upper): preserve the case of the variable names, or convert them to lowercase or uppercase.
  • clear: replace the data in memory with the imported data. If not specified and there are unsaved data in memory, Stata stops with an error instead of overwriting them.
  • encoding("encoding"): specify the character encoding of the SPSS file. This may be needed to read special characters correctly.
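
To round this off, a minimal sketch of the full conversion from .sav to .dta follows. The file name survey.sav is a placeholder, and note that import spss is only available in Stata 16 or later.

```stata
* survey.sav is a hypothetical file name; adjust to your file.
* import spss requires Stata 16 or later.

* Load the SPSS file with lowercase variable names, replacing data in memory.
import spss using "survey.sav", case(lower) clear

* Check the imported variables, storage types, and labels.
describe

* Write the imported data to a Stata .dta file.
save "survey_spss.dta", replace
```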

  • July 1, 2023