Calculate Readmission Rate

Calculating readmission rate

You can also try this one (I don't know why, but the answer above was giving false readmission flags for me).
After sorting on visit_start_date:

visits_pandas_df.groupby('PatientId').apply(lambda x: (((x['visit_start_date'].shift(-1)-x['visit_end_date']).dt.days.shift(1).le(30)) ).astype(int)).values

Visits that are only one day apart are not counted as readmissions, so you will also need to check for that in your logic.
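The one-day check can be sketched in plain Python. This is only a minimal sketch, not the pandas one-liner above: the function name `flag_readmissions`, the tuple layout and the `min_gap_days` parameter are made up for illustration, and it assumes "one day apart" means a gap of at most one day between the previous visit_end_date and the next visit_start_date.

```python
from collections import defaultdict
from datetime import date

def flag_readmissions(visits, window_days=30, min_gap_days=2):
    """Return one 0/1 flag per visit, in input order.

    visits: list of (patient_id, visit_start_date, visit_end_date) tuples.
    A visit is flagged when it starts between min_gap_days and window_days
    (inclusive) after the same patient's previous visit ended.
    """
    by_patient = defaultdict(list)
    for idx, (pid, start, end) in enumerate(visits):
        by_patient[pid].append((start, end, idx))

    flags = [0] * len(visits)
    for rows in by_patient.values():
        rows.sort()  # order each patient's visits by visit_start_date
        for (_, prev_end, _), (start, _, idx) in zip(rows, rows[1:]):
            gap = (start - prev_end).days
            if min_gap_days <= gap <= window_days:
                flags[idx] = 1
    return flags

# Hypothetical visits, deliberately not sorted by patient.
visits = [
    ("A", date(2020, 1, 1), date(2020, 1, 5)),
    ("A", date(2020, 1, 6), date(2020, 1, 8)),    # 1 day after discharge: not counted
    ("A", date(2020, 1, 20), date(2020, 1, 22)),  # 12 days after discharge: readmission
    ("B", date(2020, 3, 1), date(2020, 3, 2)),    # first visit of patient B
    ("A", date(2020, 4, 1), date(2020, 4, 3)),    # 70 days after discharge: not counted
]
flags = flag_readmissions(visits)
print(flags)
```

If your definition of a readmission differs (for example, same-day transfers only), adjust `min_gap_days` accordingly.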

readmission rates in R

OK, I've got an answer for you that uses the dplyr package. I'll try to explain what's going on, but you might have to read up a little bit. If you don't know the pipe operator (%>%), just read it as "then".

# we need 2 libraries
library(tidyverse) # this provides several useful packages
library(lubridate) # this lets you deal more easily with dates

# first, we create a table that contains the csn of the relevant cases
# and the days since last admission

parkDf <- df %>% # we take our data frame
mutate(admit_date = as_date(admit_date)) %>% # turn the dates into date format
filter(encounter_type == "Inpatient") %>% # keep only the "Inpatient" cases
arrange(mrn, admit_date) %>% # sort them first by mrn and then by admit_date
group_by(mrn) %>% # group them by mrn so we can for each patient...
mutate(daysSinceLastAdmit = admit_date - lag(admit_date)) %>% # ...get the days since last admit
mutate(daysSinceLastAdmit = as.integer(daysSinceLastAdmit)) %>% # turn this into an integer
ungroup() %>% # ungroup it (must be done - don't ask)
select(csn, daysSinceLastAdmit) # and keep only these two columns

# now we left-join this to our original dataframe
df %>%
left_join(parkDf, by = "csn") %>%
mutate(readmission_30day = if_else(!is.na(daysSinceLastAdmit) & daysSinceLastAdmit <= 30, "yes", "no")) %>% # "yes" only if the previous admit was at most 30 days earlier
select(-daysSinceLastAdmit) # and remove the unwanted one

You can read more about the functions used if you google "R for data science" or "r4ds"; this is a book by Hadley Wickham, who is a great author. If you don't know what a left join is, just google "sql left join". It's basically: take what's on the right (our newly created dataframe) and add its information to what's on the left (the original dataframe), matching on a column that exists in both dataframes (by). Rows on the left without a match are kept and get NA in the new columns.
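If the left join is still unclear, here is a rough sketch of the idea in plain Python. The helper `left_join` and the sample rows are hypothetical and only meant to illustrate the matching; this is not how dplyr implements it. It assumes the key is unique on the right side, as csn is in the lookup table.

```python
def left_join(left_rows, right_rows, by):
    """Left join two lists of dicts on the column named `by`.

    Every row of left_rows is kept. Columns from the matching row in
    right_rows are added; without a match they are filled with None.
    Assumes `by` is unique in right_rows.
    """
    right_by_key = {row[by]: row for row in right_rows}
    extra_cols = sorted({c for row in right_rows for c in row if c != by})
    joined = []
    for row in left_rows:
        match = right_by_key.get(row[by], {})
        joined.append({**row, **{c: match.get(c) for c in extra_cols}})
    return joined

# Hypothetical data: csn 3 is an outpatient visit, so it is missing on the right.
df = [
    {"csn": 1, "mrn": "p1", "encounter_type": "Inpatient"},
    {"csn": 2, "mrn": "p1", "encounter_type": "Inpatient"},
    {"csn": 3, "mrn": "p1", "encounter_type": "Outpatient"},
]
park_df = [
    {"csn": 1, "daysSinceLastAdmit": None},  # first admission: no earlier admit
    {"csn": 2, "daysSinceLastAdmit": 12},
]
result = left_join(df, park_df, by="csn")
for row in result:
    print(row)
```

All three rows of the left table survive; the outpatient row simply ends up with an empty daysSinceLastAdmit.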

Hope this helps.

How to calculate monthly 30-days readmission rate from historical inpatient data in SQL?

Looks like I am a bit late, but nonetheless, here is my suggestion:

WITH mnths AS (
select cast(1 as int) m
UNION ALL select m+1 FROM mnths WHERE m<60
), admdis AS (
SELECT *, 12*(year(admdate)-2014)+month(admdate) a, 12*(year(disdate)-2014)+month(disdate) d
FROM tbl t
WHERE exists (SELECT 1 FROM tbl WHERE id=t.id AND disdate>DATEADD(day, -30, t.admdate) and sn<t.sn)
)
SELECT (m-1)/12+2014 yr, 1+(m-1)%12 mn,m, count(id) readm
FROM mnths LEFT JOIN admdis ON a=m
GROUP BY m

I set up a number table mnths to generate a regular grid of months for which I am then counting the readmissions. The CTE admdis only lists those patient admissions that happened less than 30 days after a previous discharge.
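The month arithmetic used in the query can be sketched in Python as follows. The names `month_index` and `year_and_month` and the sample dates are made up for illustration; only the formulas are taken from the SQL above.

```python
from datetime import date

BASE_YEAR = 2014  # first year of the grid, as in the query
N_MONTHS = 60     # five years of months, as produced by the mnths CTE

def month_index(d):
    """Map a date to its 1-based position on the month grid (Jan 2014 -> 1)."""
    return 12 * (d.year - BASE_YEAR) + d.month

def year_and_month(m):
    """Inverse of month_index: recover (year, month) from a grid position."""
    return (m - 1) // 12 + BASE_YEAR, 1 + (m - 1) % 12

# Hypothetical admission dates of rows that already qualified as readmissions.
readmission_dates = [date(2014, 1, 20), date(2018, 2, 3), date(2018, 2, 25)]

# Start every month at zero so that empty months still show up,
# which is what the LEFT JOIN against mnths achieves in SQL.
counts = {m: 0 for m in range(1, N_MONTHS + 1)}
for d in readmission_dates:
    counts[month_index(d)] += 1

for m in (1, 2, 50):
    print(year_and_month(m), counts[m])
```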

A demo can be found here: https://rextester.com/TLM57882

Edit:

When looking at forpas's solution I realised that I did not calculate the actual readmission rate. Here is a modified version that does exactly that:

WITH mnths AS (
select cast(1 as int) m
UNION ALL select m+1 FROM mnths WHERE m<60
), admdis AS (
SELECT *, 12*(year(admdate)-2014)+month(admdate) a, 12*(year(disdate)-2014)+month(disdate) d,
(SELECT MAX(1) FROM tbl WHERE id=t.id AND disdate>DATEADD(day, -30, t.admdate) and sn<t.sn) readm -- MAX(1) keeps the subquery scalar when several earlier stays match
FROM tbl t
)

SELECT (m-1)/12+2014 yr, 1+(m-1)%12 mn, count(id) totl ,count(readm) readm,
case when count(id)>0 THEN (0.+count(readm))/ count(id) ELSE 0 END rate
FROM mnths LEFT JOIN admdis ON a=m
GROUP BY m

When calculating the rate I had to make sure that I did not "divide by zero", so in those months where no admissions happened the rate is assumed to be zero (but not actually calculated).

demo: https://rextester.com/NFCXQ24711

And here is the generated result:

yr    mn totl readm rate
2014 1 3 1 0.333333333333
2014 2 0 0 0.000000000000
2014 3 0 0 0.000000000000
2014 4 0 0 0.000000000000
2014 5 0 0 0.000000000000
2014 6 0 0 0.000000000000
2014 7 0 0 0.000000000000
2014 8 0 0 0.000000000000
2014 9 0 0 0.000000000000
2014 10 0 0 0.000000000000
2014 11 0 0 0.000000000000
2014 12 0 0 0.000000000000
2015 1 0 0 0.000000000000
2015 2 0 0 0.000000000000
2015 3 0 0 0.000000000000
2015 4 0 0 0.000000000000
2015 5 0 0 0.000000000000
2015 6 0 0 0.000000000000
2015 7 0 0 0.000000000000
2015 8 0 0 0.000000000000
2015 9 0 0 0.000000000000
2015 10 0 0 0.000000000000
2015 11 0 0 0.000000000000
2015 12 0 0 0.000000000000
2016 1 0 0 0.000000000000
2016 2 0 0 0.000000000000
2016 3 0 0 0.000000000000
2016 4 0 0 0.000000000000
2016 5 0 0 0.000000000000
2016 6 0 0 0.000000000000
2016 7 0 0 0.000000000000
2016 8 0 0 0.000000000000
2016 9 0 0 0.000000000000
2016 10 0 0 0.000000000000
2016 11 0 0 0.000000000000
2016 12 1 0 0.000000000000
2017 1 0 0 0.000000000000
2017 2 0 0 0.000000000000
2017 3 0 0 0.000000000000
2017 4 0 0 0.000000000000
2017 5 0 0 0.000000000000
2017 6 1 0 0.000000000000
2017 7 0 0 0.000000000000
2017 8 0 0 0.000000000000
2017 9 0 0 0.000000000000
2017 10 0 0 0.000000000000
2017 11 0 0 0.000000000000
2017 12 0 0 0.000000000000
2018 1 1 0 0.000000000000
2018 2 1 1 1.000000000000
2018 3 0 0 0.000000000000
2018 4 0 0 0.000000000000
2018 5 0 0 0.000000000000
2018 6 0 0 0.000000000000
2018 7 0 0 0.000000000000
2018 8 0 0 0.000000000000
2018 9 0 0 0.000000000000
2018 10 1 0 0.000000000000
2018 11 0 0 0.000000000000
2018 12 0 0 0.000000000000

The column totl contains the total admissions in that period, readm the number of readmissions and rate is the ratio of readm/totl (or 0 in those cases where totl is 0).
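The same guard against dividing by zero looks like this in Python. This is a small sketch: the function name `readmission_rate` is hypothetical, and the rows are a few lines copied from the result above.

```python
def readmission_rate(totl, readm):
    """Return readm / totl, or 0.0 when there were no admissions at all."""
    return readm / totl if totl > 0 else 0.0

# (yr, mn, totl, readm) rows copied from the result above.
rows = [(2014, 1, 3, 1), (2014, 2, 0, 0), (2016, 12, 1, 0), (2018, 2, 1, 1)]
rates = [readmission_rate(totl, readm) for _, _, totl, readm in rows]
for (yr, mn, totl, readm), rate in zip(rows, rates):
    print(yr, mn, totl, readm, f"{rate:.12f}")
```

Reporting 0 for months without admissions follows the query; returning None instead would be one way to keep "no admissions" apart from "no readmissions".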

How to mark episodes in which patient will be readmitted in 30-days?

I would go with something like this:

library(tidyverse)
df %>%
arrange(patientPersonalNumber, admission.date) %>%
group_by(patientPersonalNumber) %>%
mutate(re.admin = (lag(discharge.date) + 30) >= admission.date) %>%
mutate(re.admin = ifelse(is.na(re.admin), FALSE, re.admin ))

# A tibble: 14 x 4
# Groups: patientPersonalNumber [10]
patientPersonalNumber admission.date discharge.date re.admin
<chr> <date> <date> <lgl>
1 001 2013-06-23 2013-06-25 FALSE
2 002 2013-06-30 2014-07-03 FALSE
3 004 2013-07-22 2014-08-01 FALSE
4 004 2014-07-12 2014-07-17 TRUE
5 005 2013-06-24 2013-06-30 FALSE
6 005 2013-07-02 2013-07-05 TRUE
7 005 2013-07-09 2013-07-12 TRUE
8 005 2013-09-08 2013-10-12 FALSE
9 006 2013-06-28 2013-06-30 FALSE
10 007 2013-06-29 2013-07-02 FALSE
11 008 2013-06-23 2013-06-29 FALSE
12 009 2013-06-24 2013-06-29 FALSE
13 010 2013-06-24 2013-06-27 FALSE
14 011 2013-06-24 2013-06-28 FALSE

Creating a dummy variable in R that indicates whether an ID is subsequently duplicated

We could use duplicated

df1$ReEnters <- +(duplicated(df1$ID, fromLast = TRUE))

-output

df1
# ID Year ReEnters
#1 A1 2007 0
#2 B3 2007 1
#3 B3 2009 0
#4 C6 2000 1
#5 C6 2007 1
#6 C6 2010 0

data

df1 <- structure(list(ID = c("A1", "B3", "B3", "C6", "C6", "C6"), Year = c(2007L, 
2007L, 2009L, 2000L, 2007L, 2010L)), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))
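For comparison, the same "does this ID show up again later?" flag can be sketched in plain Python. The function name `reenters` is made up for illustration; the IDs are the ones from df1 above.

```python
def reenters(ids):
    """Return 1 where the same ID appears again later in the list, else 0."""
    seen_later = set()
    flags = []
    for value in reversed(ids):  # walk from the last row to the first
        flags.append(1 if value in seen_later else 0)
        seen_later.add(value)
    flags.reverse()              # restore the original row order
    return flags

ids = ["A1", "B3", "B3", "C6", "C6", "C6"]  # the ID column of df1
print(reenters(ids))  # [0, 1, 0, 1, 1, 0]
```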

Program calculating tax rate per line item equaling into ExclVAT

There is a bit of a problem here with rounding: no combination of tax rates actually gives us exactly the right answer.

I've updated the code to find the combination that gives us the most accurate number that can be achieved:

from itertools import product

import numpy as np

# get all possible tax rate combinations
x = [0.00, 0.09, 0.21]
combinations = np.array(list(product(x, repeat=10)))

# get amount columns
amounts = dfcalc.filter(like='line amount')

# calculate amounts excluding VAT for each row for each tax rate combination
exclvat = amounts.fillna(0).dot((1 + combinations.T)**-1)

# for each row find the combination whose amount excluding VAT
# is closest to the value in ExclVAT column for that row
ix = np.abs(exclvat.sub(dfcalc['ExclVAT'].squeeze(), 0)).idxmin(1)
taxrates = np.where(amounts.notna(), combinations[ix], np.nan)

# subtract tax from line amounts
dfcalc[amounts.columns] /= (1 + taxrates)
dfcalc['line amount sum'] = dfcalc.filter(like='line amount').sum(1)
dfcalc.T

Output:

                         0
line amount#1 4.917355
line amount#2 4.917355
line amount#3 14.449541
line amount#4 14.449541
line amount#5 3.223140
line amount#6 2.396694
line amount#7 NaN
line amount#8 NaN
line amount#9 NaN
line amount#10 NaN
BTW 5.850000
ExclVAT 44.350000
Totaal 50.200000
line amount sum 44.353628
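To make the search easier to follow, here is a stripped-down sketch of the same brute-force idea for a single row, without numpy or pandas. The function name `best_tax_rates` is hypothetical, and the VAT-inclusive amounts are back-calculated from the output above (for example, 4.917355 * 1.21 is about 5.95), so treat them as illustrative.

```python
from itertools import product

RATES = (0.00, 0.09, 0.21)

def best_tax_rates(amounts_incl_vat, target_excl_vat, rates=RATES):
    """Try every rate assignment and keep the one whose VAT-exclusive
    total is closest to target_excl_vat."""
    best_combo, best_total = None, None
    for combo in product(rates, repeat=len(amounts_incl_vat)):
        total = sum(a / (1 + r) for a, r in zip(amounts_incl_vat, combo))
        if best_total is None or (
            abs(total - target_excl_vat) < abs(best_total - target_excl_vat)
        ):
            best_combo, best_total = combo, total
    return best_combo, best_total

# Back-calculated line amounts including VAT; they add up to Totaal (50.20).
amounts = [5.95, 5.95, 15.75, 15.75, 3.90, 2.90]
combo, total = best_tax_rates(amounts, target_excl_vat=44.35)
print(combo, round(total, 6))
```

With six line items this checks only 3**6 = 729 combinations, so a plain loop is fast enough; the vectorised version above pays off once there are many rows.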

