Destring a time variable using Stata

Submitted by ぐ巨炮叔叔 on 2020-01-07 11:29:14

Question


How to destring a time variable (7:00) using Stata?

I have tried destring; however, the colon (:) prevents the conversion. I then tried destring, ignore(:) but was then unable to convert the result to a double or apply the format %tc. encode does not work, and recast does not do the job either.

I also have a separate string date that I was able to destring and convert to a double.

Am I missing that I could be combining these two string variables (one date, one time) into a date/time variable or is it correct to destring them individually and then combine them into a date/time variable?


Answer 1:


Short answer

To give the bottom line first: two string variables that hold date and time information can be converted to a single numeric date-time variable using some operation like

 generate double datetime = clock(date + time, "DMY hm") 
 format datetime %tc 

except that the exact details will depend on exactly how your dates are held.
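
As a minimal sketch of that operation, assume the two string variables are named date and time and hold values such as "28may2015" and "7:00". The names, the values and the "DMY hm" mask are assumptions for illustration, not details from the question. A space is placed between the two strings so that the year and the hour cannot run together.

```stata
clear
* Hypothetical example data: string dates and string times
input str9 date str5 time
"28may2015" "7:00"
"28may2015" "7:59"
"29may2015" "8:00"
end

* Join the strings with a space, then convert in one step.
* The result is in milliseconds, so it must be stored as double.
generate double datetime = clock(date + " " + time, "DMY hm")
format datetime %tc
list
```

If the dates are held in another order, such as month first, the mask would need to change accordingly (for example "MDY hm").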

For understanding dates and times in Stata there is no substitute for

help dates and times

Everything else tried is likely to be wrong or irrelevant or both, as your experience shows.

Longer answer, addressing misconceptions

destring, encode and recast are all (almost always) completely wrong in Stata for converting string dates and/or times to numeric dates and/or times. (I can think of one exception: if somehow a date in years had been imported as string with values "1960", "1961", etc. then destring would be quite all right.)

In reverse order,

  • recast is not for any kind of numeric to string or string to numeric conversion. It only recasts among numeric or among string types.

  • encode is essentially for mapping obvious strings to numeric and (unless you specify otherwise) will produce integer values 1, 2, 3, and so forth which will be quite wrong for times or dates in general.

  • destring as you applied it implies that the string times "7:00", "7:59", "8:00" should be numeric, except that someone stupidly added irrelevant punctuation. But if you strip the colons :, you get times 700, 759, 800, etc. which will not match the standard properties of times. For example, the difference between "8:00" and "7:59" is clearly one minute, but removing the informative punctuation would just yield numbers 800 and 759, which differ by 41, which makes no sense.
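
The arithmetic in the last point can be checked directly. This is only a sketch using literal values: real() stands in for what destring with the colons ignored would produce, and 60000 is the number of milliseconds in a minute.

```stata
* What you get after stripping the colon: plain numbers 800 and 759
display real("800") - real("759")

* What you get from a proper time conversion: a difference in
* milliseconds, divided here by 60000 to express it in minutes
display (clock("8:00", "hm") - clock("7:59", "hm")) / 60000
```

The first difference is 41, which means nothing as a time; the second is 1, that is, one minute.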

For a pure time, you can set up your own system, or use Stata's date-time functions.

For a time between "00:00" and "23:59" you can use Stata's date-times:

. di %tc clock("7:00", "hm")
01jan1960 07:00:00

. di %tc_HH:MM clock("7:00", "hm")
 07:00

With variables you would need to generate a new variable and make sure that it is created as double.
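
A minimal sketch with a variable, assuming a string variable named time (the variable names and the data are hypothetical):

```stata
clear
* Hypothetical example data
input str5 time
"7:00"
"7:59"
"8:00"
end

* double is specified explicitly because date-times are counted in
* milliseconds, and float lacks the precision to hold such values
generate double time_num = clock(time, "hm")

* Show hours and minutes only, hiding the notional date
format time_num %tcHH:MM
list
```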

A pure time less than 24 hours is (notionally) a time on 1 January 1960, but you can ignore that. But you need to hold in mind (constantly!) that the underlying numeric units are milliseconds. Only the format gives you a time in conventional terms.
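
To see the underlying units, it may help to display the same value with and without a format. This is a sketch on literal values; hh() and mm() are Stata's functions for extracting the hour and minute from a date-time.

```stata
* Unformatted: the number of milliseconds since 01jan1960 00:00:00,
* here 7 * 60 * 60 * 1000
display clock("7:00", "hm")

* Formatted: the same number shown as a conventional time
display %tcHH:MM clock("7:00", "hm")

* Extract the hour and minute components from the millisecond value
display hh(clock("7:59", "hm"))
display mm(clock("7:59", "hm"))
```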

If you have times of 24 hours or more, using Stata's date-times in this way is probably not a good idea.

Your own system could just be to convert string times in the form "hh:mm" to minutes and do calculations in those terms. For times held as variables, the easiest way forward would be to use split, destring to produce numeric variables holding hours and minutes and then use 60 * hours + minutes.
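
That home-made approach might look like the following sketch, again assuming a string variable named time in the form "hh:mm". The stub part and the name total_minutes are arbitrary choices for illustration.

```stata
clear
* Hypothetical example data
input str5 time
"7:00"
"7:59"
"8:00"
end

* Split at the colon into string variables part1 (hours) and part2 (minutes)
split time, parse(":") generate(part)

* Here destring is appropriate: the pieces are plain numbers held as strings
destring part1 part2, replace

* Minutes since midnight
generate total_minutes = 60 * part1 + part2
list
```

Differences between such values are then directly in minutes, so "8:00" minus "7:59" comes out as 1.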

However, despite your title, the real problem here seems to be dealing jointly with date and time information, not just time information, so at this point, you might like to read the short answer again.



Source: https://stackoverflow.com/questions/30506490/destring-a-time-variable-using-stata
