Prime number function in R
A number a
is divisible by a number b
if the result of the division a / b
is equal to the result of the integer division a %/% b
. Any integer pn
can be divided by at least two numbers: 1
and pn
. Prime numbers are those than can only be divided by those two. Breaking out the code:
pn / 1:pn
are the results of the divisions by1
,2
, ...,pn
pn %/% 1:pn
are the results of the integer divisions by1
,2
, ...,pn
sum(pn / 1:pn == pn %/% 1:pn)
are how many of these are equal, i.e., the number of integer divisors ofpn
. If this number is2
, you have a prime.
What was wrong with your code: if
needs to test if something is TRUE
or FALSE
but you were passing it a whole vector. Also, your logic was wrong. It should have been:
is.prime <- function(num) {
if (num == 2) {
TRUE
} else if (any(num %% 2:(num-1) == 0)) {
FALSE
} else {
TRUE
}
}
And once you've settled on returning a logical, you can make your code a lot shorter:
is.prime <- function(n) n == 2L || all(n %% 2L:max(2,floor(sqrt(n))) != 0)
(which incorporates @Carl's comment about not checking all numbers.)
Implementing function to detect if a number is prime using any() in R
As you have stated, we can easily call an isprime
function from many packages to obtain your answer quickly, but that is not your aim here. You are learning and trying to understand basic principals in R
and programming in general. Many kudos!!
As many have pointed out, the use of range
here is inappropriate in this situation. This function simply returns the minimum and maximum of a given vector (See ?range
for more info). So for your example of 55, since range(2:54)
returns 2
and 54
, 55 is not divisible by either of these numbers, hence the return of TRUE
.
As noted by @LAP, you need the colon :
operator. So if you change range(2:(n-1))
to simply 2:(n-1)
, your algorithm will work (for the most part... it won't work for n <= 2
).
Since you state that you are in the process of learning, you should also think about ways of coding in a more efficient and thoughtful manner. In order to check whether a number is prime, we only need to check up to the square root of n. This will reduce many checks.
Let's also think a little harder on how we can avoid further checks. We know when checking for primality, we only need to consider prime numbers. Since 2 is the only even prime number, we could first check if the input is even, and if it isn't, only check divisibility by odd numbers only. We can achieve the last bit by generating a sequence with fixed steps by calling the function seq
or seq.int
.
Let's see this in action:
## Pseudo-Corrected OP function (2 should return TRUE, 1 should return FALSE, etc.)
prime2 <- function(n) {
rangeOfNumbers <- 2:(n-1)
if(any(n%%rangeOfNumbers == 0)){
return(FALSE)
}
else return(TRUE)
}
prime2Enhanced <- function(n) {
if (n < 2)
return(FALSE)
else if (n %in% c(2, 3, 5, 7))
return(TRUE)
else if (n %% 2 == 0)
return(FALSE)
else if (any(n %% seq.int(3, sqrt(n), 2) == 0))
return(FALSE)
else
return(TRUE)
}
all.equal(sapply(3:1000, prime2), as.logical(gmp::isprime(3:1000)))
[1] TRUE
all.equal(sapply(3:1000, prime2Enhanced), as.logical(gmp::isprime(3:1000)))
[1] TRUE
We can do even better by looking at extending our programming concepts to vectorization. This is when we can pass a vector and get the result all at one time instead of using a loop. This is a very important concept in R
and I highly recommend learning about it. The R Inferno is an excellent resource for such a topic (see Circle 3).
prime2Vectorized <- function(v) {
result <- rep(TRUE, length(v))
testVec <- as.integer(c(2, seq(3, sqrt(max(v)), 2)))
## Although we are using a loop here, we are taking
## advantage of vectorization concepts with each x
for (x in testVec) {
s <- which(v >= x^2)
result[s[v[s] %% x == 0]] <- FALSE
}
result
}
all.equal(prime2Vectorized(3:1000), as.logical(gmp::isprime(3:1000)))
[1] TRUE
I will save it as an exercise to handle edge cases and further optimization.
Now, let's see how much improvements we have in efficiency using the library microbenchmark
:
library(microbenchmark)
microbenchmark(orig = sapply(3:1000, prime2),
improved = sapply(3:1000, prime2Enhanced),
vectorize = prime2Vectorized(3:1000))
Unit: microseconds
expr min lq mean median uq max neval
orig 3863.279 4198.5595 5470.2178 4403.5050 4723.485 15775.377 100
improved 1670.418 1740.1240 1937.2396 1809.6680 1935.365 9912.629 100
vectorize 202.006 218.5505 306.4068 233.9045 249.696 6243.121 100
Hopefully you can see that there are many fundamental concepts that can be taken away from this simple exercise. If you really want to learn about primality tests, you should check out Miller-Rabin primality test which is implemented in gmp
, numbers
, and other packages.
prime numbers in R
(I already voted to close this question as duplicated. However, I post an answer to show how to use the codes provided in the reference post).
Using the question and answers from R: Prime number function
Having n
a vector of 100 values:
n <- 1:100
# First function:
is.prime1 <- function(num) {
if (num == 2) {
TRUE
} else if (any(num %% 2:(num-1) == 0)) {
FALSE
} else {
TRUE
}
}
Test for unique number:
is.prime1(1)
#[1] FALSE
is.prime1(2)
#[1] TRUE
But you need to Vectorize
this function in order to use a vector of values as input:
is.prime1 <- Vectorize(is.prime1)
Which gives
is.prime1(n)
# [1] FALSE TRUE TRUE FALSE TRUE FALSE TRUE FALSE FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE TRUE FALSE TRUE
# [20] FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE TRUE FALSE
# [39] FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE
# [58] FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE
# [77] FALSE FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
# [96] FALSE TRUE FALSE FALSE FALSE
Now, the prime numbers:
n[is.prime1(n)]
# [1] 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
Same with:
# Seconde function:
is.prime2 <- function(n) n == 2L || all(n %% 2L:ceiling(sqrt(n)) != 0)
is.prime2 <- Vectorize(is.prime2)
n[is.prime2(n)]
# [1] 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
Using isprime
from matlab
package:
library(matlab)
n[as.logical(isprime(n))]
# [1] 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97
Print all prime numbers less than 50
It is more similar to the code in the question.
prime = 0:50
for(val in prime){
if (val < 2)
next
else {
f = FALSE
for (temp in 2:sqrt(50))
if (val %% temp == 0 && val > temp){
f = TRUE
break
}
if (f) next
}
print(val)
}
how to write a function to find prime numbers in a range in RStudio
For small numbers (less than 1 million), implement trial division by primes less than 1000:
primes_less_than_1000 = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613,
617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997]
def is_prime(n):
factor = 0
if n in primes_less_than_1000:
return 1
if n > 1000:
for p in primes_less_than_1000:
if (n%p==0):
factor = 1
return 0
if (factor==0):
return 1
else:
return 0
Now that we have a prime-determining function, we can test any range below 1 million:
def primes_in_range(lower,upper):
primes = []
for n in range(lower,upper):
if is_prime(n)==1:
primes.append(n)
return primes
For example:
print(primes_in_range(1000,1100))
>>> [1009, 1013, 1019, 1021, 1031, 1033, 1039, 1049, 1051, 1061, 1063, 1069, 1087, 1091, 1093, 1097]
If you want primes more than 1 million, trial divison will take up more memory, so you should look at https://primes.utm.edu/ for other prime or probable prime algorithms.
Related Topics
Convert Data.Frame Column to a Vector
Adding Space Between Bars in Ggplot2
Create Categorical Variable in R Based on Range
Converting Nested List to Dataframe
Ggplot2 Multiple Sub Groups of a Bar Chart
Error ".Onload Failed in Loadnamespace() for 'Tcltk'"
How to Specify the Actual X Axis Values to Plot as X Axis Ticks in R
Generate Random Numbers with Fixed Mean and Sd
How to Check If Object (Variable) Is Defined in R
Add New Row to Dataframe, at Specific Row-Index, Not Appended
Using a Pre-Defined Color Palette in Ggplot
Anova Test Fails on Lme Fits Created with Pasted Formula
Align Ggplot2 Plots Vertically
Converting Two Columns of a Data Frame to a Named Vector
Evaluating Both Column Name and the Target Value Within 'J' Expression Within 'Data.Table'