I have never needed to use fixed-format ASCII data in Stata; I typically work with CSV files. However, I was looking for data to replicate this paper by Bill Greene on a general method to incorporate selectivity into limited dependent variable models.
I was about to write to him for the data, but decided to take a quick look through his website first to see whether he provides it. It turns out he provides a subset of that dataset, Table F25.1: Expenditure and Default Data, 1319 observations, as part of the example datasets for the 6th edition of his massively bestselling Econometric Analysis.
The small problem is that the dataset is a fixed-format text file, probably laid out as an Nlogit/Limdep dataset, so reading it into Stata is not straightforward.
So, I decided to figure out how to write a dictionary file and read that data in using Stata's -infile- command. The dictionary file looks like this:
```
dictionary {
    _first(4)       * first line of data is the fourth
    _lines(3)       * there are three lines of data per observation
    _line(1)        * begin with line one of each observation
    Cardhldr    "Dummy variable, 1 if application for credit card accepted, 0 if not"
    Majordrg    "Number of major derogatory reports"
    Age         "Age in years plus twelfths of a year"
    Income      "Yearly income (divided by 10,000)"
    Exp_Inc     "Ratio of monthly credit card expenditure to yearly income"
    _newline        * move to the next line of an observation
    Avgexp      "Average monthly credit card expenditure"
    Ownrent     "1 if owns their home, 0 if rent"
    Selfempl    "1 if self employed, 0 if not."
    Depndt      "1 + number of dependents"
    Inc_per     "Income divided by number of dependents"
    _newline        * move to the next (last) line of an observation
    Cur_add     "months living at current address"
    Major       "number of major credit cards held"
    Active      "number of active credit accounts"
}
```

Save this file as "limdep2stata.dct". Then, this dictionary file can be used to read in the data using a do-file which looks like this:
```stata
/*
 * Read in LIMDEP data in Stata
 */
infile using limdep2stata.dct, using(TableF25-1.txt) clear
renvars _all, lower
drop if cardhldr==.     // one extra line read in
```

I guess the data file lacks an end-of-file delimiter and so an extra line is read in before Stata figures out that the file has ended. I will see if there is a simple solution to avoid this. But it does no harm and the extra line is easily dropped.
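A quick way to confirm that the import worked is to compare what is in memory against the table's description. The lines below are only a sketch: they assume the do-file above has just been run, that the variable names are already lowercase, and that the file really contains the 1319 observations advertised.

```stata
* Sanity checks after the import (a sketch; assumes the data are in memory)
assert _N == 1319                       // Table F25.1 is listed with 1319 observations
describe                                // should show 13 variables with the dictionary labels
summarize cardhldr ownrent selfempl     // the dummies should range between 0 and 1
```

If the -assert- fails, the extra line was probably not dropped, or the dictionary is out of step with the layout of the text file.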
And that's it! You are good to go.
PS. I must mention that the command -renvars- is due to Nick Cox and Jeroen Weesie.
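If -renvars- is not installed, the built-in -rename- command in Stata 12 or later can lowercase the variable names as well. A minimal alternative to the -renvars- line, assuming a recent enough version of Stata:

```stata
* Built-in substitute for "renvars _all, lower" (requires Stata 12 or later)
rename *, lower
```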