Calculate industry concentration based on four biggest numbers

北城以北, submitted 2020-01-15 09:10:35

Question


I am trying to find the four biggest values of a variable in Stata, because I want to calculate the industry concentration of different groups based on sales. I have firm sales for multiple years, and the firms belong to different groups defined by industry and country.

Thus, I would like to find:

industry concentration = (sum of the 4 biggest sales values in one industry-and-country group in one year) / (sum of all sales in that industry-and-country group in that year)

I have about 10000 firms for about 10 years:

firms   country   year   industry   sales  
    a       usa      1          1     300  
    a       usa      2          1    4000  
    b       ger      1          1     200  
    b       ger      2          1     400  
    c       usa      1          1     100  
    c       usa      2          1     300  
    d       usa      1          1     400  
    d       usa      2          1     200  
    e       usa      1          1    7000  
    e       usa      2          1     900  
    f       ger      1          2     100  
    f       ger      2          2     700  
    h       ger      1          2     700  
    h       ger      2          2     600   

I know how to find the sum of sales per industry-country-year-group:

bysort country industry year: egen sum_sales = total(sales)

Answer 1:


The sum of the four biggest is

bysort country industry year (sales): generate four_biggest_sales = sales[_N] + ///
                                      sales[_N-1] + sales[_N-2] + sales[_N-3] 

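provided that no values of sales are missing and every group has at least four observations. If a group has only three values, sales[_N-3] is out of range and evaluates to missing, so you'd need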

max(0, sales[_N-3]) 

with similar corrections for the cases of two values, one value or none.
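
The corrections for small groups can be folded into a single expression, because max() ignores missing arguments and an out-of-range subscript evaluates to missing. The following is a minimal sketch, assuming sales is never missing and never negative; it re-enters the sample data from the question so that it can be run on its own as a do-file.

```stata
* Sample data copied from the question
clear
input str1 firms str3 country year industry sales
a usa 1 1 300
a usa 2 1 4000
b ger 1 1 200
b ger 2 1 400
c usa 1 1 100
c usa 2 1 300
d usa 1 1 400
d usa 2 1 200
e usa 1 1 7000
e usa 2 1 900
f ger 1 2 100
f ger 2 2 700
h ger 1 2 700
h ger 2 2 600
end

* Sum of up to four largest sales values per country-industry-year group.
* Each term is wrapped in max(0, ...), so a subscript below 1 (which
* evaluates to missing) contributes 0 instead of making the sum missing.
* Assumption: sales is never missing and never negative.
bysort country industry year (sales): generate four_biggest_sales = ///
    max(0, sales[_N])   + max(0, sales[_N-1]) +                      ///
    max(0, sales[_N-2]) + max(0, sales[_N-3])
```

Groups with fewer than four firms then get the sum of all their sales, so their concentration ratio is 1.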

This all follows from the basic syntax of the by prefix. See this article in the Stata Journal for a tutorial.

If there are missing values, they sort last within each group and would occupy sales[_N]. They can be segregated into their own groups with

generate isnotmiss = !missing(sales) 
bysort isnotmiss country industry year (sales): generate four_biggest_sales = sales[_N] + ///
                                                sales[_N-1] + sales[_N-2] + sales[_N-3] 
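
To finish the calculation the question asks for, the sum of the four biggest values is divided by the group total. The sketch below uses a small made-up group (five firms plus one firm with missing sales) rather than the asker's data; the variable name cr4 is a hypothetical choice, and the code assumes every group has at least four non-missing values.

```stata
* Hypothetical data: one group, five observed firms and one missing value
clear
input str1 firms str3 country year industry sales
a usa 1 1 10
b usa 1 1 20
c usa 1 1 30
d usa 1 1 40
e usa 1 1 100
f usa 1 1 .
end

* Put missing sales in their own by-groups so they cannot occupy sales[_N]
generate isnotmiss = !missing(sales)
bysort isnotmiss country industry year (sales): generate four_biggest_sales = ///
    sales[_N] + sales[_N-1] + sales[_N-2] + sales[_N-3]

* Group total; egen's total() ignores missing values
bysort country industry year: egen sum_sales = total(sales)

* Four-firm concentration ratio (cr4 is an illustrative name);
* it stays missing for observations whose own sales are missing
generate cr4 = four_biggest_sales / sum_sales
```

Here the four biggest values sum to 190 and the group total is 200, so cr4 is 0.95 for firms a to e and missing for firm f.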


Source: https://stackoverflow.com/questions/17771069/calculate-industry-concentration-based-on-four-biggest-numbers
