Split street address into street number and street name in R
You can try:
# x is a character vector of addresses; split at the space that follows the street number
y <- lapply(strsplit(x, "(?<=\\d)\\b ", perl = TRUE), function(p) if (length(p) < 2) c("", p) else p)
y <- do.call(rbind, y)
colnames(y) <- c("Street Number", "Street Name")
Hope this helps.
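For readers working outside R, the same idea can be sketched in Python with the standard re module. This is only an illustration: the function name split_address and the sample addresses are made up, and maxsplit=1 is an added assumption so that digits later in the address do not trigger further splits.

```python
import re

# Same pattern as the R snippet: a space preceded by a digit at a word boundary.
_SPLIT = re.compile(r"(?<=\d)\b ")

def split_address(address):
    """Return (street_number, street_name); the number is "" when none is found."""
    parts = _SPLIT.split(address, maxsplit=1)
    if len(parts) < 2:
        # No "digit followed by space" found: keep the whole string as the name.
        return "", address
    return parts[0], parts[1]

for a in ["123 Main St", "Main St", "45B Oak Ave"]:
    print(split_address(a))
```

Note that the pattern is not anchored to the start of the string, so an input such as "Highway 61 North" would be split after "61"; a number with a letter suffix like "45B" is not split at all.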
Separate street name strings with street number and letter in Python
IIUC, you can use this regex:
df[1].str.extract(r'(\D+)\s+(\d+)\s?(.*)')
Output:
                     0   1  2
0           SØNDRE VEI  54
1          UTSIKTVEIEN  20  B
2  KAARE MOURSUNDS VEG  14  A
3          OKSVALVEIEN  19
4       SLEMDALSVINGEN  33  A
5     GAMLESTRØMSVEIEN  59
6       JONAS LIES VEI  68  A
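To see what each capture group picks up without pandas, here is a minimal sketch using the standard re module with the same regex. The group names (name, number, letter) and the helper parse are illustrative additions, not part of the original answer.

```python
import re

# Street name (non-digits), house number (digits), optional trailing letter.
PATTERN = re.compile(r"(?P<name>\D+)\s+(?P<number>\d+)\s?(?P<letter>.*)")

def parse(address):
    """Return (name, number, letter), or None when no house number is present."""
    m = PATTERN.search(address)
    if m is None:
        return None
    return m.group("name"), m.group("number"), m.group("letter")

for a in ["SØNDRE VEI 54", "KAARE MOURSUNDS VEG 14 A", "OKSVALVEIEN"]:
    print(parse(a))
```

An address without any digits does not match at all, so callers need to handle the None case (in pandas, str.extract would yield NaN for such rows).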
Grouping by street address and splitting it into street name and number
I think you need str.extract:
import pandas as pd
df = pd.DataFrame({'street address': ['500 wall street', '123 blafoo']})
print(df)
    street address
0  500 wall street
1       123 blafoo
# Raw string so the regex escapes are not interpreted as Python string escapes
df1 = df['street address'].str.extract(r'(?P<number>\d+)(?P<name>.*)', expand=True)
print(df1)
  number          name
0    500   wall street
1    123        blafoo
Solution with split:
df[['number','name']] = df['street address'].str.split(n=1, expand=True)
print(df)
    street address number         name
0  500 wall street    500  wall street
1       123 blafoo    123       blafoo
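One detail worth checking with the extract pattern: nothing in it consumes the space after the digits, so the name group keeps a leading space, whereas the split version does not. A small sketch with plain re (pandas str.extract is built on the same regex engine, so it should behave alike); the variable names and the added \s* are illustrative assumptions.

```python
import re

# Pattern from the answer, and a variant that swallows whitespace after the number.
loose = re.compile(r"(?P<number>\d+)(?P<name>.*)")
tight = re.compile(r"(?P<number>\d+)\s*(?P<name>.*)")

address = "500 wall street"
loose_name = loose.match(address).group("name")
tight_name = tight.match(address).group("name")

print(repr(loose_name))  # name still starts with the separating space
print(repr(tight_name))  # separating space consumed by \s*
```

Alternatively, calling .str.strip() on the extracted name column would remove the leading space after the fact.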
Splitting pandas address into street and house numbers
From your data, you can do:
df.address.str.extract(r'(?P<Street>\D+) (?P<Number>\d+.*)')
Output:
          Street     Number
0  Rue de blabla         20
1   Vossenstraat          7
2  Rue Père Jean  3 boite Z
3   Rue XSZFEFEF        331
Remember that this will fail if the street name itself contains a digit, e.g. 5th avenue.
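If digits in street names are a concern, one possible workaround is to require the house number to be its own whitespace-separated token. The sketch below uses plain re; the pattern, the optional single-letter suffix, and the helper split_street are assumptions for illustration and only cover street-first addresses like the ones above.

```python
import re

# Lazy street part, then a token made of digits plus an optional single letter
# that ends at a word boundary, so "5th" is not taken as a house number.
PATTERN = re.compile(r"^(?P<street>.*?)\s+(?P<number>\d+[A-Za-z]?\b.*)$")

def split_street(address):
    """Return (street, number) or None when no standalone number token exists."""
    m = PATTERN.match(address)
    return (m.group("street"), m.group("number")) if m else None

for a in ["Rue Père Jean 3 boite Z", "5th avenue 12", "5th avenue"]:
    print(split_street(a))
```

The same pattern string could be passed to str.extract, but it would still misread names where a bare number is part of the street (for example "Route 66 12"), so treat it as a heuristic.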
How to split a street address after the first number?
If you always want to split after the first run of digits, you can use a regular expression for that.
Here's a full example:
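using System;
using System.Text.RegularExpressions;
string input = "North Street 57A 1floor";
// Zero-width match between a digit and the non-digit that follows it
var regex = new Regex(@"(?<=\d)(?=\D)");
// count = 2: split at the first match only
var parts = regex.Split(input, 2);
foreach (var part in parts)
    Console.WriteLine(part);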
Output:
North Street 57
A 1floor
The pattern (?<=\d)(?=\D) matches the zero-width position between a digit and the non-digit that follows it, i.e. right after a run of digits. We then call Regex.Split(string input, int count) with count = 2 so that at most two parts are returned.
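For comparison, roughly the same split can be sketched in Python. This assumes Python 3.7 or later, where re.split accepts patterns that match the empty string; maxsplit=1 plays the role of count=2 in the C# call.

```python
import re

text = "North Street 57A 1floor"

# Zero-width split point: preceded by a digit, followed by a non-digit.
# maxsplit=1 means one split, i.e. at most two parts.
parts = re.split(r"(?<=\d)(?=\D)", text, maxsplit=1)

for part in parts:
    print(part)
```

Because the lookahead needs a following character, a string that ends in digits is returned unsplit as a single-element list.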
Splitting a column into two
This is how I addressed it. I hope this will help.
Create Database TestDB
go
USE TestDB
GO
--Create Sample Table CustomerAddress
create table CustomerAddress(Address char(100))
go
insert into CustomerAddress values('123 Main St')
insert into CustomerAddress values('XYZ St')
insert into CustomerAddress values(' abc')
select * from CustomerAddress
--Option #1a - Split the Address column; return an empty string when there is no street number
SELECT
Street_Number =
CASE WHEN (ISNUMERIC(LEFT(Address, 1)) = 1) THEN LEFT(Address, CHARINDEX(' ', Address))
ELSE ''
END ,
Street_Name =
CASE WHEN (ISNUMERIC(LEFT(Address, 1)) = 1) THEN substring(Address, CHARINDEX(' ', Address) + 1, len(Address) - (CHARINDEX(' ', Address) - 1))
ELSE Address
END
FROM [dbo].CustomerAddress;
--Option #1b - Split the Address column; return NULL when there is no street number
SELECT
Street_Number =
CASE WHEN (ISNUMERIC(LEFT(Address, 1)) = 1) THEN LEFT(Address, CHARINDEX(' ', Address))
ELSE NULL
END ,
Street_Name =
CASE WHEN (ISNUMERIC(LEFT(Address, 1)) = 1) THEN substring(Address, CHARINDEX(' ', Address) + 1, len(Address) - (CHARINDEX(' ', Address) - 1))
ELSE Address
END
FROM [dbo].CustomerAddress;
--Option #2a - Use LIKE '[0-9]%' instead of ISNUMERIC; it may perform better, and unlike ISNUMERIC it does not accept characters such as '+', '-' or '$'
SELECT
Street_Number = CASE WHEN (Address LIKE '[0-9]%') THEN LEFT(Address, CHARINDEX(' ', Address))
ELSE NULL
END ,
Street_Name = CASE WHEN (Address LIKE '[0-9]%') THEN substring(Address, CHARINDEX(' ', Address) + 1, len(Address) - (CHARINDEX(' ', Address) - 1))
ELSE Address
END
FROM [dbo].CustomerAddress;
--Clean up by dropping the table
drop table [dbo].CustomerAddress
go
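To try the LIKE-based option without a SQL Server instance, the logic can be approximated with Python's built-in sqlite3 module. This is a sketch in a different SQL dialect, not the T-SQL above: SQLite's GLOB, instr and substr stand in for LIKE '[0-9]%', CHARINDEX and SUBSTRING, and the column is TEXT rather than char(100).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CustomerAddress (Address TEXT)")
conn.executemany(
    "INSERT INTO CustomerAddress VALUES (?)",
    [("123 Main St",), ("XYZ St",), (" abc",)],
)

# GLOB '[0-9]*' is used here as SQLite's counterpart to LIKE '[0-9]%'.
query = """
SELECT
    CASE WHEN Address GLOB '[0-9]*'
         THEN substr(Address, 1, instr(Address, ' ') - 1)
         ELSE NULL END AS Street_Number,
    CASE WHEN Address GLOB '[0-9]*'
         THEN substr(Address, instr(Address, ' ') + 1)
         ELSE Address END AS Street_Name
FROM CustomerAddress
ORDER BY rowid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
conn.close()
```

The sample rows are the three inserted in the script above; an address that starts with a digit but contains no space is not handled by this sketch.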