From SAS to R and Back: Transferring SAS Data to R

Dr. Michael Allgöwer

Published on 21.4.2016

Updated on 8.5.2025

Data Science & AI

SAS and R are closely related: both are popular tools for people like us who want to solve statistics and machine learning problems on (more or less) large data volumes. Despite this apparent proximity, there are few touchpoints between the two communities, and only few people work with both tools. As passionate "outside the box" thinkers, we regret that. With this blog article we therefore start a mini-series that covers, in loose order, topics connecting both worlds. This first article deals with the options for exchanging data between the systems. Because there are so many of them, it is limited to the transfer from SAS to R; the opposite direction will follow in a later article.

The transfer from SAS to R

There are various ways to transfer SAS data into an R system. We group them into three rough categories:

1. The generic possibilities

In this context, "generic" means that these methods are suitable for data transfer between all kinds of systems, not only R and SAS. Two methods in particular must be named here: the transfer via CSV files and joint access to the same relational database.

The benefit of the first method is that it requires little specific know-how. The respective commands are well known to most users: PROC EXPORT on the SAS side and read.csv in R, or the fread command from the data.table package as the better alternative, and not only for large data volumes. Those who have used these commands more often know, however, that time-consuming manual fine-tuning is frequently required until the data exchange really works. It starts with a sensible choice of separator (which must be identical on both sides). If longer free texts are to be exchanged, e.g. in text mining applications, it does not stop there by a long way. From the right choice of string delimiter (which should not occur in the free texts) to problems with encodings (in particular when country-specific special characters are transferred across platforms), there are numerous opportunities to spend a long time on apparently minor details. Thus, an unpleasant option for impatient people like me.
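The three details named above (separator, string delimiter, encoding) can be pinned down explicitly on the R side. The following is a minimal sketch in base R, not a definitive recipe: the file content is a made-up stand-in for what PROC EXPORT might write, and the choice of semicolon, double quote and UTF-8 is an assumption that has to match the actual export. As far as I know, fread from data.table accepts comparable sep, quote and encoding arguments.

```r
# Hypothetical stand-in for a file written on the SAS side:
# semicolon as separator, double quote as string delimiter, UTF-8 bytes.
csv_path <- tempfile(fileext = ".csv")
lines <- c(
  "id;comment;amount",
  "1;\"Gr\u00f6\u00dfe; passt nicht\";12.5",
  "2;\"plain text\";7"
)
# useBytes = TRUE writes the UTF-8 bytes as they are, without re-encoding.
writeLines(enc2utf8(lines), csv_path, useBytes = TRUE)

# State separator, string delimiter and encoding explicitly instead of
# relying on defaults; all three must match what the SAS side wrote.
dat <- read.csv(
  csv_path,
  sep = ";",
  quote = "\"",
  dec = ".",
  encoding = "UTF-8",
  stringsAsFactors = FALSE
)

# The semicolon inside the quoted free text must not split the field.
stopifnot(ncol(dat) == 3, nrow(dat) == 2)
print(dat$comment)
```

If the string delimiter were missing or different on one side, the first data row would be split into four fields instead of three, which is exactly the kind of minor detail that costs time.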

Joint access to the same relational database is much more pleasant than the transfer via CSV. The problems with separators, string delimiters and, in part, encodings disappear, and data types can be transferred much more easily. In addition, the database is useful not only for transferring the data but also for working with it, and it is easy to select during the transfer only the subset of data that is actually relevant. However, this method has demanding prerequisites: access to the same database from both systems is often not available. Still, it is frequently easier to set up than a direct connection between R and SAS, which the next group of methods requires.
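Selecting only the relevant subset during the transfer could look roughly like this on the R side. This is a sketch under assumptions: it uses the DBI package with an in-memory SQLite database (via RSQLite) as a stand-in for the shared database, and the table and column names are invented for illustration. In a real setup the connection would point at the database server that SAS also writes to.

```r
library(DBI)

# Assumption: an in-memory SQLite database stands in for the shared
# relational database that both SAS and R can reach.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical table that the SAS side would have filled.
dbWriteTable(con, "customers", data.frame(
  id = 1:4,
  region = c("north", "south", "north", "west"),
  revenue = c(120.5, 80.0, 45.25, 300.0)
))

# Fetch only the rows and columns that are relevant for the analysis.
north <- dbGetQuery(
  con,
  "SELECT id, revenue FROM customers WHERE region = ?",
  params = list("north")
)

dbDisconnect(con)
print(north)
```

Column types arrive as proper R types (integer, numeric, character) without any parsing of text, which is the main convenience compared to CSV.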

2. The methods which use a local SAS installation

Many methods for importing SAS data into R require access to a local SAS installation. As this requirement is not fulfilled in many cases, we will not deal with this possibility in detail.

3. Direct transfer of native SAS files without recourse to a SAS installation

SAS7BDAT is the standard format on the SAS side. It exists in several variants; in particular, compression can be switched on or off. Anyone who tries to read these files with the R package "foreign" is in for a surprise: it does not work. "foreign" is one of the first packages that come to mind when one wants to import foreign data formats into R, and it does support SAS files. Unfortunately, this support is limited to an ancient data format (SAS XPORT) that is rarely used in the SAS community anymore. The format has almost grotesque limitations; in particular, variable names may not be longer than eight characters. Unfortunately, it is the only SAS data format for which SAS has disclosed a specification, which may also be the reason why the developers of "foreign" support only this format. The format is not reasonably usable; compared to simple CSV files it offers only disadvantages.
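To get a feeling for how restrictive the eight-character limit is, one can check the column names of a typical data set against it. The helper below is a hypothetical illustration in base R (the function name and the example data are made up); it only checks name length and does not cover any other restriction of the format.

```r
# Return the column names that exceed the eight-character limit
# mentioned above for the SAS XPORT format.
too_long_for_xport <- function(df, max_chars = 8L) {
  nms <- names(df)
  nms[nchar(nms) > max_chars]
}

# Hypothetical data set with typical descriptive column names.
example <- data.frame(
  customer_id = 1:2,
  revenue = c(10, 20),
  first_purchase_date = as.Date(c("2016-01-01", "2016-02-01"))
)

print(too_long_for_xport(example))
```

In this example, two of the three column names would have to be shortened before the data could be stored in the XPORT format.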

Fortunately, Matt Shotwell took on the Sisyphean task of analyzing the SAS7BDAT data format. Based on findings from reverse engineering, he wrote an R package of the same name that can read these files (writing is not possible). Unfortunately, the package can only cope with the uncompressed variant of the data format. See below for an alternative that can also read compressed data.

Fortunately, it is easy to deactivate compression when creating the file: just set the option "COMPRESS=NO" in the DATA step. However, a great many SAS systems write the compressed variant by default (the default behavior can be configured by means of OPTIONS). This means that in many cases one cannot simply take a SAS file and import it into R. Rather, one has to create a file that R can read by deactivating compression. Then the import works reliably, although not overly quickly.
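Put together, the workflow described above might be wrapped as follows. This is a sketch that assumes the sas7bdat package is installed; the wrapper name, the file name and the SAS library and data set names in the comment are hypothetical. The wrapper only adds a hint about compression to whatever error the package reports; it does not detect compression itself.

```r
# On the SAS side, the file would be written without compression, e.g.
#   data out.mydata (compress=no); set work.mydata; run;
# (library and data set names are hypothetical).
read_sas_uncompressed <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  if (!requireNamespace("sas7bdat", quietly = TRUE)) {
    stop("Package 'sas7bdat' is required but not installed.")
  }
  tryCatch(
    sas7bdat::read.sas7bdat(path),
    error = function(e) {
      # Add a hint, since compressed files are a likely cause of failure.
      stop(
        "Could not read '", path, "'. If the file was written with ",
        "compression, re-create it with COMPRESS=NO. Original error: ",
        conditionMessage(e)
      )
    }
  )
}
```

A call would then look like `dat <- read_sas_uncompressed("mydata.sas7bdat")`, with the file name replaced by the actual export.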

Building on Matt Shotwell's work, a Java library was later developed that can also cope with compressed files. Matt Shotwell has made this Java library accessible in an R package named sas7bdat.parso. However, this package is a little harder to install (it requires rJava) and is not available via CRAN, only via GitHub. Like the sas7bdat package, it is rather slow.

As so often, the best way depends on the exact situation. Personally, I would prefer a jointly used database and, if that is not possible, switch to one of Matt Shotwell's two packages. Only if there is no other way would I fall back on CSV files.
