Extract first N digits from a string
Solution using the regex \\D* to skip non-digit characters and \\d{2} to match the first two digits:
as.numeric(sub("\\D*(\\d{2}).*", "\\1", INPUT))
# [1] 55 19 24 24
Data:
INPUT <- c("ABC Conference Room Monitor - Z5580J",
           "ABC 19 Monitor",
           "ABC 24 Monitor for Video-Conferencing",
           "ABC UltraSharp 24 Monitor -QU2482Z")
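For comparison, the same idea can be sketched in Python with the standard re module. The function name first_n_digits and the parameter n are hypothetical, introduced only for illustration. Unlike the sub() call above, this sketch searches for the first run of n consecutive digits, so a string without such a run yields None rather than being returned unchanged.

```python
import re

def first_n_digits(text, n=2):
    """Return the first run of n consecutive digits in text as an int, or None."""
    match = re.search(r"\d{%d}" % n, text)
    return int(match.group()) if match else None

inputs = [
    "ABC Conference Room Monitor - Z5580J",
    "ABC 19 Monitor",
    "ABC 24 Monitor for Video-Conferencing",
    "ABC UltraSharp 24 Monitor -QU2482Z",
]
print([first_n_digits(s) for s in inputs])  # [55, 19, 24, 24]
```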
How do I get the first n characters of a string without checking the size or going out of bounds?
Here's a neat solution:
String upToNCharacters = s.substring(0, Math.min(s.length(), n));
Opinion: while this solution is "neat", I think it is actually less readable than a solution that uses if / else in the obvious way. If the reader hasn't seen this trick, he/she has to think harder to understand the code. IMO, the code's meaning is more obvious in the if / else version. For a cleaner / more readable solution, see @paxdiablo's answer.
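To make the readability comparison concrete, here is a sketch of both variants side by side in Python. The function names first_n_min and first_n_if_else are hypothetical. Python slices already clamp to the string length on their own, so the explicit bound is kept here only to mirror the Java one-liner.

```python
def first_n_min(s, n):
    # Clamp the end index, mirroring s.substring(0, Math.min(s.length(), n)).
    return s[:min(len(s), n)]

def first_n_if_else(s, n):
    # The explicit branch, written out in the obvious way.
    if len(s) <= n:
        return s
    else:
        return s[:n]

print(first_n_min("hello", 3), first_n_min("hi", 3))  # hel hi
```

Both functions return the same result for any non-negative n; the difference is purely one of style.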
Get the first numbers from a String in Java
This method returns the number at the start of a string, reading digits until the first non-digit character is found. It returns 0 if the string does not start with a digit.
static int getNumbers(String s) {
    StringBuilder digits = new StringBuilder(); // collects the leading digits
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        if (c < '0' || c > '9') {
            break; // stop at the first non-digit character
        }
        digits.append(c);
    }
    // an empty buffer means there was no leading digit, so return 0
    return digits.length() == 0 ? 0 : Integer.parseInt(digits.toString());
}
Usage:
getNumbers(s);
getNumbers(str);
Output:
5634
78695
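The same "read digits until something else appears" logic can be sketched in Python with itertools.takewhile. The function name leading_number, its default parameter, and the sample inputs are made up for illustration; they are not the inputs used in the answer above.

```python
from itertools import takewhile

def leading_number(s, default=0):
    """Return the digits at the start of s as an int, or default if there are none."""
    digits = "".join(takewhile(lambda ch: ch in "0123456789", s))
    return int(digits) if digits else default

print(leading_number("42abc"))  # 42
print(leading_number("2024"))   # 2024
print(leading_number("abc12"))  # 0
```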
How to extract first 8 characters from a string in pandas
You are close; you need indexing with str, which is applied to each value of the Series:
data['Order_Date'] = data['Shipment ID'].str[:8]
For better performance, if the column contains no NaN values, use a list comprehension:
data['Order_Date'] = [x[:8] for x in data['Shipment ID']]
print(data)
        Shipment ID  Order_Date
0  20180504-S-20000    20180504
1  20180514-S-20537    20180514
2  20180514-S-20541    20180514
3  20180514-S-20644    20180514
4  20180514-S-20644    20180514
5  20180516-S-20009    20180516
6  20180516-S-20009    20180516
7  20180516-S-20009    20180516
8  20180516-S-20009    20180516
If you omit str, the code slices the column by position and returns the first N rows instead, like:
print(data['Shipment ID'][:2])
0    20180504-S-20000
1    20180514-S-20537
Name: Shipment ID, dtype: object
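The "no NaN values" caveat for the list comprehension matters because a missing value is a float, and slicing a float raises a TypeError. Below is a minimal sketch of a guarded version, using a plain Python list as a stand-in for the column; the helper name first_chars is hypothetical.

```python
def first_chars(values, n=8):
    """Slice the first n characters of each string; pass non-strings (e.g. NaN) through."""
    return [x[:n] if isinstance(x, str) else x for x in values]

shipment_ids = ["20180504-S-20000", float("nan"), "20180514-S-20537"]
result = first_chars(shipment_ids)
print(result[0], result[2])  # 20180504 20180514
```

The isinstance check costs a little speed, so the unguarded comprehension remains the faster choice when the column is known to be complete.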
Extract the first 2 Characters in a string
You can just use the substr function directly to take the first two characters of each string:
x <- c("75 to 79", "80 to 84", "85 to 89")
substr(x, start = 1, stop = 2)
# [1] "75" "80" "85"
You could also write a simple function to do a "reverse" substring, giving the 'start' and 'stop' values assuming the index begins at the end of the string:
revSubstr <- function(x, start, stop) {
  x <- strsplit(x, "")
  sapply(x,
         function(x) paste(rev(rev(x)[start:stop]), collapse = ""),
         USE.NAMES = FALSE)
}
revSubstr(x, start = 1, stop = 2)
# [1] "79" "84" "89"
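For readers coming from Python, both directions can be sketched with slices, since negative indices count from the end of the string. The variable names first_two and last_two are chosen here for illustration only.

```python
x = ["75 to 79", "80 to 84", "85 to 89"]

first_two = [s[:2] for s in x]   # first two characters of each string
last_two = [s[-2:] for s in x]   # last two characters, counted from the end

print(first_two)  # ['75', '80', '85']
print(last_two)   # ['79', '84', '89']
```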
How to get first n characters from a string in R
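library(stringr)
library(dplyr)
library(purrr)  # provides map_chr()

# Take the first three letters of each word, then join them with "_"
df$name %>%
  str_extract_all("(?<=(^|[:space:]))[:alpha:]{3}") %>%
  map_chr(~ str_c(.x, collapse = "_"))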
The stringr cheatsheet is very useful for working through these types of problems.
https://www.rstudio.com/resources/cheatsheets/
Created on 2022-03-26 by the reprex package (v2.0.1)
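A rough Python counterpart of the pipeline above, as a sketch under stated assumptions: the function name first_letters and the sample name are hypothetical, and [A-Za-z] covers ASCII letters only, which is narrower than [:alpha:]. Python's re module requires fixed-width lookbehind, so the (^|[:space:]) alternation is replaced by (?<!\S), meaning "not preceded by a non-whitespace character".

```python
import re

def first_letters(name, n=3):
    """Join the first n letters of each word in name with underscores."""
    pattern = r"(?<!\S)[A-Za-z]{%d}" % n
    return "_".join(re.findall(pattern, name))

print(first_letters("Jonathan Smith"))  # Jon_Smi
```

As with the stringr version, words shorter than n letters produce no match and are skipped.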
extract first N digits from string with regex
How about:
$str = preg_replace('/^(\d+).*$/', "$1", $str);