Stata Documentation

Philosophy

RegiStream for Stata is designed to make labeling your data as effortless as possible, allowing you to focus on your research rather than metadata management.

Traditional approaches to variable and value labels in Stata require manually writing label definitions, which is time-consuming and error-prone. RegiStream automates this: specify your dataset domain and language, and the matching metadata is applied to your data.

The package integrates seamlessly with Stata's native labeling system, which means all existing Stata commands and workflows continue to work exactly as expected. Once labels are applied, you can use describe, codebook, tabulate, and any other Stata command with properly labeled data.

RegiStream also supports reproducibility through versioned metadata: you can pin specific versions for published research while staying current with the latest metadata updates for ongoing projects.

Installation

RegiStream for Stata can be installed in three ways, although SSC installation is not yet available. Choose the method that works best for your environment.

Net Install (Recommended)

The quickest way to install RegiStream is directly from our website using Stata's net install command:

net install registream, from("https://registream.org/install/stata/latest") replace

This command will download and install the latest version of RegiStream, including all necessary .ado and .sthlp files.

SSC Install (Coming Soon)

Soon, you will be able to install RegiStream directly from the SSC repository:

ssc install registream

Note: SSC installation will be available once the package is submitted and approved by the SSC archive.

Manual Download for Offline Installation

If you are working on an offline server or a high-security environment (e.g., MONA), you can manually download the RegiStream package as a zip file:

* Download from:
* https://registream.org/get_zip/stata/latest

After downloading, follow these steps to install manually:

  1. Unzip the downloaded registream_stata.zip file on your local system
  2. Transfer all extracted files to your secure server system
  3. In Stata on the secure system, check your ado paths:
sysdir
adopath

Installation Options

Option A: If you have a PERSONAL path and can access it:

  • Create a folder named registream inside your PERSONAL ado directory
  • Move all the extracted files into this folder

Option B: If PERSONAL path doesn't exist or isn't accessible (common in MONA):

  1. Create a new folder at a location you can access, e.g., /path-to-your-folder/stata_packages/registream
  2. Move all the extracted files into this new folder
  3. At the beginning of each Stata session, add this folder to Stata's ado path:
adopath + "/path-to-your-folder/stata_packages/registream"

Tip: To avoid running adopath + every session, you can add this command to your profile.do file, which Stata runs automatically at startup.
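
As an illustration of the tip above, a profile.do might look like the following minimal sketch. The folder is the placeholder path from Option B, the local macro names rs_dir and rs_ok are made up for this example, and the existence check is an optional safeguard rather than something RegiStream requires.

```stata
* profile.do: Stata runs this file automatically at startup when it finds it
* Placeholder path from Option B; replace it with your own folder
local rs_dir "/path-to-your-folder/stata_packages/registream"

* Check that the folder exists before touching the ado path
* (direxists() is a built-in Mata function returning 1 or 0)
mata: st_local("rs_ok", strofreal(direxists(st_local("rs_dir"))))

if `rs_ok' {
    * Append the folder to the end of Stata's ado search path
    adopath + "`rs_dir'"
}
else {
    display as error "RegiStream folder not found: `rs_dir'"
}
```

With the check in place, a missing or mistyped folder produces a visible message at startup instead of a confusing "unrecognized command" error later in the session.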

Verification

To verify the installation was successful, run:

which autolabel

This should display the path to the autolabel.ado file.
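
In a do-file, the same check can be made non-fatal so that a missing installation produces a readable message instead of stopping the script. This is only a sketch using standard Stata commands; the wording of the message is illustrative.

```stata
* which exits with a nonzero return code when the command cannot be found
capture which autolabel
if _rc {
    display as error "autolabel was not found on the ado path"
    * Show the directories Stata currently searches
    adopath
}
else {
    * Print the location of the installed file
    which autolabel
}
```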

Uninstalling

If installed via net install or ssc install:

ado uninstall registream

If installed manually, simply delete the registream folder from your ado path location.

Updating RegiStream

RegiStream provides simple commands to update both the package itself and the metadata datasets. We recommend updating regularly to get the latest features, bug fixes, and metadata improvements.

Update Package

To check for and install package updates, use any of the following equivalent commands:

* All of these check for package updates:
. registream update          // Default behavior
. registream update package  // Explicit
. autolabel update           // Delegates to registream

Interactive Update Process

When an update is available, you'll see an interactive prompt:

------------------------------------------------------------
RegiStream Package Update Check
------------------------------------------------------------

Current version: 1.1.0
Latest version:  1.2.0

A new version is available!

Would you like to update now? (y/n)
> y

Updating RegiStream...
------------------------------------------------------------
...
✓ Update successful!

What happens during update:

  • Compares your installed version with the latest version from the API
  • If update available, prompts for confirmation
  • If you answer 'y': Runs net install registream, from("https://registream.org/install/stata/latest") replace
  • If you answer 'n': Shows manual update instructions

Manual Update (Alternative Method)

You can also manually update by uninstalling and reinstalling:

* Method 1: Uninstall and reinstall via net install
. ado uninstall registream
. net install registream, from("https://registream.org/install/stata/latest")

* Method 2: Uninstall and reinstall via SSC (once the package is available there)
. ado uninstall registream
. ssc install registream

Update Datasets

To check for and download updates to your metadata datasets, use:

* Check for dataset updates:
. registream update dataset   // Singular
. registream update datasets  // Plural (same behavior)
. autolabel update dataset    // Via delegation
. autolabel update datasets   // Via delegation

Targeted Dataset Updates

You can check for updates to specific domain/language combinations:

* Update specific domain and language
. registream update dataset, domain(scb) lang(eng)

* Update all datasets for a domain
. registream update dataset, domain(scb)

* Force re-download even if current
. registream update dataset, domain(scb) lang(eng) force

Dataset Update Behavior:

  • Checks your local metadata files against the latest versions available
  • Downloads only updated datasets (not all datasets)
  • Use domain() and lang() options for targeted updates
  • Add force option to re-download even if up-to-date

Checking Current Version

To see which version of RegiStream you have installed:

* Check installed version
. which registream

* View package info
. ado describe registream

Note: The package version (e.g., 1.1.0) is hardcoded in the package itself and is the single source of truth. It is not stored in configuration files.

Quick Start

Get started with RegiStream in seconds! Here's a basic example of labeling variables in your dataset:

* Load your dataset
use mydata.dta, clear

* Label all variables with SCB metadata (English)
autolabel variables *, domain(scb) lang(eng)

* Label values for specific variables (Swedish)
autolabel values kon civst, domain(scb) lang(swe)

That's it! Your variables and values are now labeled with comprehensive metadata from Statistics Sweden.
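
Because the labels are ordinary Stata labels, the result can be inspected with standard commands. The sketch below assumes that mydata.dta contains the variables kon and civst used in the example above; substitute your own variable names as needed.

```stata
* Apply the labels as in the example above
use mydata.dta, clear
autolabel variables *, domain(scb) lang(eng)
autolabel values kon civst, domain(scb) lang(eng)

* Variable labels appear in the describe output
describe kon civst

* Compact summary of each variable, including its label
codebook kon civst, compact

* Value labels are used in tabulations
tabulate kon civst

* List every value label currently defined in memory
label list
```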

Usage Guide

Labeling Variables

The autolabel variables command adds descriptive labels to your variables:

* Label all variables in English
autolabel variables *, domain(scb) lang(eng)

* Label specific variables in Swedish
autolabel variables kon alder incdispink, domain(scb) lang(swe)

* Force re-download of metadata
autolabel variables *, domain(scb) lang(eng) force

Labeling Values

The autolabel values command applies value labels to categorical variables:

* Label values for specific variables
autolabel values kon civst, domain(scb) lang(eng)

* Label values for all variables that have value labels available
autolabel values *, domain(scb) lang(swe)

Looking Up Variables

The autolabel lookup command allows you to preview metadata for variables without applying labels to your dataset:

* Look up a single variable
autolabel lookup kon, domain(scb) lang(eng)

* Look up multiple variables
autolabel lookup carb yrkarbtyp kaross, domain(scb) lang(eng)

* Use wildcards
autolabel lookup ku*ink, domain(scb) lang(swe)

This displays the variable label, definition, type, unit, and value labels (if applicable) for each variable. If a variable does not exist in the domain, a warning is displayed.

Tip: Use lookup to explore available metadata before deciding which variables to label in your dataset.

Command Options

Option      Description
domain()    Dataset domain (currently only scb available). Required.
lang()      Language: eng (English) or swe (Swedish). Required.
exclude()   Specify a varlist of variables to exclude from labeling
suffix()    Create new labeled variables with specified suffix instead of overwriting original variables
force       Force re-download of metadata files

Examples with Options

* Exclude specific variables from labeling
autolabel variables ku*ink yrkarbtyp, domain(scb) lang(eng) exclude(ku3ink)

* Create new labeled variables with suffix
autolabel variables kon alder, domain(scb) lang(eng) suffix("_lbl")

* This creates kon_lbl and alder_lbl with labels, leaving originals unchanged

* Combine options
autolabel variables *, domain(scb) lang(swe) exclude(id personid) suffix("_labeled")
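
To confirm what suffix() did, the original and suffixed variables can be compared with standard Stata commands. This is a sketch that assumes the dataset contains kon and alder, and that the new variables are named kon_lbl and alder_lbl as described above.

```stata
* Create labeled copies and leave the originals untouched
autolabel variables kon alder, domain(scb) lang(eng) suffix("_lbl")

* Compare the originals with the new suffixed variables
describe kon kon_lbl alder alder_lbl

* Print the variable label attached to one of the new variables
display "`: variable label kon_lbl'"
```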

Datasets

SCB Datasets

RegiStream currently supports comprehensive metadata from Statistics Sweden (SCB), including:

  • Variables (~28,000 variables) - Variable names, descriptions, definitions, units, and types
  • Value Labels - Categorical value mappings for all labeled variables
  • Languages - Available in both English (eng) and Swedish (swe)

The metadata is automatically downloaded and cached locally when you first run an autolabel command.

Custom Datasets

You can create your own custom datasets and use them with RegiStream. See the Custom Datasets Guide for detailed instructions.

Configuration

First-Run Setup

When you first run RegiStream or any of its commands (autolabel, etc.), you'll be prompted to choose a setup mode. This determines how RegiStream handles metadata downloads, updates, and data collection.

Setup Modes

1. Offline Mode
  • No internet connections
  • Manual metadata management
  • Local usage logging only (stays on your machine)

Best for: Secure environments, air-gapped systems, or users who prefer complete offline operation.

2. Standard Mode (Recommended)
  • Automatic metadata downloads
  • Automatic update checks (daily)
  • Local usage logging only
  • No online telemetry

Best for: Most users - gets convenience features while maintaining privacy.

3. Full Mode (Help improve RegiStream)
  • Everything in Standard Mode, plus:
  • Online telemetry: Sends anonymized usage data to help improve RegiStream

Best for: Users who want to contribute to RegiStream development.

You can change these settings at any time using registream config. See the RegiStream Commands section for details.

Data Storage Locations

By default, RegiStream stores metadata files in a per-user directory that depends on your operating system:

  • macOS: /Users/<your-username>/.registream/
  • Windows: C:\Users\<your-username>\AppData\Local\registream\
  • Linux: /home/<your-username>/.registream/

This enables seamless sharing of metadata across projects and programming languages.
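
To see which of these default locations applies on your machine, the path can be built from Stata's c(os) value and the usual environment variables. This is a sketch under two assumptions: that HOME and LOCALAPPDATA point to the directories listed above, and that no custom directory has been configured. The local macro names are made up for this example.

```stata
* Build the default metadata path for the current operating system
* c(os) is "Windows", "MacOSX", or "Unix"
if "`c(os)'" == "Windows" {
    * LOCALAPPDATA normally resolves to C:\Users\<your-username>\AppData\Local
    local rs_dir "`: environment LOCALAPPDATA'/registream"
}
else {
    local rs_dir "`: environment HOME'/.registream"
}

* direxists() is a built-in Mata function returning 1 or 0
mata: st_local("rs_found", strofreal(direxists(st_local("rs_dir"))))

display as text "Metadata directory: `rs_dir'"
display as text "Exists (1 = yes, 0 = no): `rs_found'"
```

A result of 0 simply means that no metadata has been downloaded to the default location yet, which is expected before the first autolabel call.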

Secure Environments (MONA, etc.)

Some systems like MONA have RegiStream installed centrally with shared metadata that you can use directly without any additional setup. However, if you want to use your own custom metadata or maintain a separate configuration, you can specify your own directory:

* Set custom storage directory (optional)
global registream_dir "path/to/your/writable/directory"

* Then use autolabel commands as normal
autolabel variables *, domain(scb) lang(eng)

Note: Only set $registream_dir if you need to use your own metadata files. Otherwise, you can use the centrally installed version without any additional configuration.

Privacy & Telemetry

By default, RegiStream keeps a local log of your commands (like .bash_history) stored in ~/.registream/usage_stata.csv. This never leaves your machine.

If you enable "Full Mode" or explicitly opt in, RegiStream can send anonymized usage data (command name, timestamp, version info) to help improve the package. This is disabled by default.

* View current settings
registream info

* Disable telemetry
registream config, telemetry_enabled(false)

RegiStream Commands

The registream command provides utilities for managing the package, checking versions, and configuring settings.

registream info

Display current configuration and settings:

. registream info

Shows:

  • Configuration directory location
  • Current version
  • All active settings (usage_logging, telemetry_enabled, internet_access, auto_update_check)
  • Citation information

registream config

Update configuration settings. With no options, displays current settings (same as registream info).

Available Settings

All settings accept true or false:

Setting              Default  Description
usage_logging()      true     Stores command history in ~/.registream/usage_stata.csv
telemetry_enabled()  false    Sends anonymized usage data to registream.org
internet_access()    true     Allows automatic metadata downloads and update checks
auto_update_check()  true     Daily background check for package updates

Mode Presets

* Offline Mode
registream config, usage_logging(true) internet_access(false) telemetry_enabled(false) auto_update_check(false)

* Standard Mode (recommended)
registream config, usage_logging(true) internet_access(true) telemetry_enabled(false) auto_update_check(true)

* Full Mode (with telemetry)
registream config, usage_logging(true) internet_access(true) telemetry_enabled(true) auto_update_check(true)

* Change individual settings
registream config, telemetry_enabled(false)

registream version

Display the current version of RegiStream:

. registream version

RegiStream version 1.1.0

registream cite

Display citation information for use in publications:

. registream cite

Shows the recommended citation format along with details about datasets used. See the Citation & Authors section for the full citation.

Troubleshooting

Common Issues

Error: "Command autolabel not found"

This means RegiStream is not installed or not on Stata's ado path (Stata itself typically reports this as "command autolabel is unrecognized", r(199)). Run which autolabel to check. If it is not found, reinstall using the installation instructions above.

Error: "Could not download metadata"

This typically means:

  • No internet connection - Check your connection or download manually
  • Firewall blocking access - Ensure registream.org is accessible
  • Custom directory not writable - Check permissions on $registream_dir
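
For the last case, one way to test whether the custom directory is writable is to create and delete a small probe file. This is a sketch that assumes $registream_dir has been set as described under Secure Environments; the probe file name rs_write_test.txt is arbitrary and not part of RegiStream.

```stata
* Only meaningful when a custom directory has been set
if "$registream_dir" == "" {
    display as text "registream_dir is not set; the default location is used"
}
else {
    tempname fh
    * Hypothetical probe file, created only for this test
    local probe "$registream_dir/rs_write_test.txt"

    * Try to create the file; capture keeps a failure from stopping the script
    capture file open `fh' using "`probe'", write text replace
    local rc = _rc

    if `rc' {
        display as error "Cannot write to $registream_dir (return code `rc')"
    }
    else {
        * Clean up the probe file again
        file close `fh'
        erase "`probe'"
        display as text "$registream_dir is writable"
    }
}
```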

Labels not appearing

Make sure you're using the correct variable names. SCB variable names are typically lowercase. Use describe to see your current variable names.
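
If your variable names are in upper or mixed case, converting them to lowercase before labeling may help. The sketch below uses Stata's built-in rename group syntax. Note that the rename fails if two names differ only by case (for example Kon and kon), so check the variable list first.

```stata
* List the current variable names
describe, simple

* Convert all variable names to lowercase
rename *, lower

* Retry labeling with the lowercase names
autolabel variables *, domain(scb) lang(eng)
```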

Citation & Authors

Citing RegiStream

If you use RegiStream in your research, please cite it as:

Clark, J. & Wen, J. (2024–). RegiStream: Streamline Your Register Data Workflow. Available at: https://registream.org

For dataset-specific citations (e.g., SCB metadata), use registream cite to see the recommended format for the specific datasets you've used in your analysis.

Authors

Jeffrey Clark

Stockholm University
Email: jeffrey.clark@su.se

Jie Wen

Swedish House of Finance
Email: jie.wen@hhs.se

Support & Contribution

For issues, feedback, or contributions:
