Error connecting to azure blob storage API from R
Looks like your problem is with the key. The key string you provided is base64-encoded, so you need to decode it to a raw vector before you use it to sign the request. For example:
library(httr) # provides GET(), add_headers(), content() and verbose()
url<-"https://preconstuff.blob.core.windows.net/pings?restype=container&comp=list"
sak<-"Q8HvUVJLBJK+wkrIEG6LlsfFo19iDjneTwJxX/KXSnUCtTjgyyhYnH/5azeqa1bluGD94EcPcSRyBy2W2A/fHQ=="
requestdate<-format(Sys.time(),"%a, %d %b %Y %H:%M:%S %Z", tz="GMT")
signaturestring<-paste0("GET",paste(rep("\n",12),collapse=""),
"x-ms-date:",requestdate,"
x-ms-version:2009-09-19
/preconstuff/pings
comp:list
restype:container")
headerstuff<-add_headers(
  Authorization=paste0("SharedKey preconstuff:",
    RCurl::base64(digest::hmac(key=RCurl::base64Decode(sak, mode="raw"),
      object=enc2utf8(signaturestring),
      algo= "sha256", raw=TRUE))),
  `x-ms-date`=requestdate,
  `x-ms-version`= "2009-09-19")
content(GET(url,config = headerstuff, verbose() ))
There are no more authentication errors this way, though no blobs are listed. Perhaps that's a different issue.
Also, I changed how the date/time is created: passing tz="GMT" to format() is a safer way to convert the local time to GMT.
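The two points in this answer (decode the key before signing, and format the timestamp in GMT) can be sketched as a pair of helpers. This is a minimal sketch, not the original poster's code: the function names rfc1123_date and sign_string are hypothetical, and it assumes the digest and RCurl packages used above are installed. The literal "GMT" suffix and the temporary switch to the "C" locale are assumptions added here so that day and month abbreviations stay in English on non-English systems.

```r
# Hypothetical helper: format a time as an RFC 1123 date in GMT.
rfc1123_date <- function(time = Sys.time()) {
  old_locale <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old_locale), add = TRUE)
  Sys.setlocale("LC_TIME", "C")  # assumption: force English day/month names
  format(time, "%a, %d %b %Y %H:%M:%S GMT", tz = "GMT")
}

# Hypothetical helper: sign a string with a base64-encoded account key.
sign_string <- function(string_to_sign, key_base64) {
  # Decode the key to raw bytes first; signing with the base64 text itself
  # produces a signature the service will reject.
  raw_key <- RCurl::base64Decode(key_base64, mode = "raw")
  mac <- digest::hmac(key = raw_key, object = enc2utf8(string_to_sign),
                      algo = "sha256", raw = TRUE)
  as.character(RCurl::base64(mac))  # base64-encode the raw HMAC
}
```

With these helpers, the Authorization value from the code above could be built as paste0("SharedKey preconstuff:", sign_string(signaturestring, sak)).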
Azure PUT Blob authentication fails in R
I managed to resolve this issue by putting each "\n" separator and each header field in the position the Shared Key signature format expects.
Based on Gaurav Mantri's help, I used:
https://learn.microsoft.com/en-us/rest/api/storageservices/authentication-for-the-azure-storage-services
The following changes in the 'signature_string' worked:
signature_string <- paste0("PUT", "\n",            # HTTP Verb
                           "\n",                   # Content-Encoding
                           "\n",                   # Content-Language
                           content_length, "\n",   # Content-Length
                           "\n",                   # Content-MD5
                           "text/plain", "\n",     # Content-Type
                           "\n",                   # Date
                           "\n",                   # If-Modified-Since
                           "\n",                   # If-Match
                           "\n",                   # If-None-Match
                           "\n",                   # If-Unmodified-Since
                           "\n",                   # Range
                           # Here comes the Canonicalized Headers
                           "x-ms-blob-type:BlockBlob","\n",
                           "x-ms-date:",requestdate,"\n",
                           "x-ms-version:2015-02-21","\n",
                           # Here comes the Canonicalized Resource
                           "/",account, "/",container,"/", filename)
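Because the position of every field matters, it can help to build the string from a vector with one element per line and join it with "\n", so a misplaced separator is easier to spot. The following is a minimal sketch under stated assumptions: the function name build_put_blob_string_to_sign and the example values are hypothetical. Two details are assumptions added here rather than part of the answer: the content length is formatted without scientific notation, and a zero length is sent as an empty field, which Microsoft's Shared Key documentation describes for version 2015-02-21 and later.

```r
# Hypothetical helper: build the Shared Key string-to-sign for a Put Blob call.
build_put_blob_string_to_sign <- function(account, container, filename,
                                          content_length, requestdate,
                                          content_type = "text/plain",
                                          version = "2015-02-21") {
  # Assumption: a zero length becomes an empty field; otherwise avoid
  # scientific notation such as "1e+05" for large bodies.
  length_field <- if (content_length == 0) "" else
    format(content_length, scientific = FALSE, trim = TRUE)
  standard_fields <- c(
    "PUT",          # HTTP Verb
    "",             # Content-Encoding
    "",             # Content-Language
    length_field,   # Content-Length
    "",             # Content-MD5
    content_type,   # Content-Type
    "",             # Date (empty because x-ms-date is sent instead)
    "",             # If-Modified-Since
    "",             # If-Match
    "",             # If-None-Match
    "",             # If-Unmodified-Since
    ""              # Range
  )
  canonicalized_headers <- c(
    "x-ms-blob-type:BlockBlob",
    paste0("x-ms-date:", requestdate),
    paste0("x-ms-version:", version)
  )
  canonicalized_resource <- paste0("/", account, "/", container, "/", filename)
  paste(c(standard_fields, canonicalized_headers, canonicalized_resource),
        collapse = "\n")
}

# Example with made-up values
example_string <- build_put_blob_string_to_sign(
  account = "myaccount", container = "mycontainer", filename = "hello.txt",
  content_length = 11, requestdate = "Thu, 04 Mar 2021 05:06:07 GMT"
)
```

Printing the result with cat(example_string) shows one field per line, which can be compared directly against the layout in the Microsoft documentation linked above.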
Connecting to Azure Blob Storage from R with SAS-URI only
I believe you're getting this error because you're trying to use a blob container URL to list blob containers.
Please try changing your code to something like:
library(AzureStor)
end_point <- blob_endpoint("https://storagename.blob.core.windows.net/",
                           sas = "sv=2018-xxxx0SCdi8aO6%2FyYzT0dHHPca0KhyNrFHtE%3D")
list_blob_containers(end_point)
Please do note that listing blob containers using a SAS token would require you to get an Account SAS token with list permission at the Service level. At the very least, your SAS token should have:
Signed Service (ss): Blob Service (b)
Signed Resource Types (srt): Service (s)
Signed Permission (sp): List (l)
If you do not have these, your list blob containers operation will fail.
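Before calling list_blob_containers, the three fields listed above can be checked by parsing the SAS token's query string. This is a minimal sketch in base R; the function name check_sas_for_container_listing and the example tokens are hypothetical, and the check only inspects the token text. It does not verify the signature or the expiry time.

```r
# Hypothetical helper: report whether a SAS token carries the fields needed
# to list blob containers (ss contains b, srt contains s, sp contains l).
check_sas_for_container_listing <- function(sas) {
  sas <- sub("^\\?", "", sas)                       # drop a leading "?" if present
  parts <- strsplit(sas, "&", fixed = TRUE)[[1]]    # one "key=value" per element
  keys <- sub("=.*$", "", parts)
  values <- sub("^[^=]*=", "", parts)
  params <- as.list(setNames(values, keys))
  has_flag <- function(field, flag) {
    value <- params[[field]]                        # exact name match; NULL if absent
    !is.null(value) && grepl(flag, value, fixed = TRUE)
  }
  c(
    signed_service_blob     = has_flag("ss", "b"),
    signed_resource_service = has_flag("srt", "s"),
    signed_permission_list  = has_flag("sp", "l")
  )
}

# Made-up tokens: the first looks like an Account SAS, the second like a
# container-level Service SAS (it has sr instead of ss/srt).
account_sas <- "sv=2018-03-28&ss=b&srt=s&sp=l&sig=placeholder"
service_sas <- "sv=2018-03-28&sr=c&sp=rl&sig=placeholder"
account_check <- check_sas_for_container_listing(account_sas)
service_check <- check_sas_for_container_listing(service_sas)
```

Any FALSE entry in the result points at the field that is missing from the token.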
HTTP/1.1 400 error when trying to connect to an Azure Table with R
Thank you for your help!
So this is the code that works:
library(httr)
library(RCurl)
library(bitops)
library(xml2)
# Stores credentials in variable
Account <- "storageaccount"
Container <- "Usage"
Key <- "key"
# Composes URL
URL <- paste0(
  "https://",
  Account,
  ".table.core.windows.net",
  "/",
  Container
)
# Requests time stamp
requestdate <- format(Sys.time(), "%a, %d %b %Y %H:%M:%S %Z", tz = "GMT")
# A GET request has no body; note that the Table service signature string below has no Content-Length field
content_length <- 0
# Composes signature string
signature_string <- paste0(
  "GET", "\n",                  # HTTP Verb
  "\n",                         # Content-MD5
  "text/xml", "\n",             # Content-Type
  requestdate, "\n",            # Date
  "/", Account, "/", Container  # Canonicalized resource
)
# Composes header string
header_string <- add_headers(
  Authorization = paste0(
    "SharedKey ",
    Account,
    ":",
    RCurl::base64(
      digest::hmac(
        key = RCurl::base64Decode(
          Key, mode = "raw"
        ),
        object = enc2utf8(signature_string),
        algo = "sha256",
        raw = TRUE
      )
    )
  ),
  'x-ms-date' = requestdate,
  'x-ms-version' = "2020-12-06",
  'Content-type' = "text/xml"
)
# Creates request
xml_body = content(
  GET(
    URL,
    config = header_string,
    verbose()
  ),
  "text"
)
Get_data <- xml_body # Gets data as text from API
From_JSON <- jsonlite::fromJSON(Get_data, flatten = TRUE) # Parses text from JSON (requires the jsonlite package)
Table_name <- as.data.frame(From_JSON) # Saves data to a data frame
I added the final three lines to parse the JSON response and save it into a data frame.
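The Table service signature string used above is much shorter than the Blob service one: verb, Content-MD5, Content-Type, date and canonicalized resource, each on its own line. A minimal sketch of that layout as a reusable function follows; the name build_table_string_to_sign is hypothetical and the values are made up for illustration.

```r
# Hypothetical helper: build the Shared Key string-to-sign for the Table service.
build_table_string_to_sign <- function(verb, content_type, requestdate,
                                       account, table, content_md5 = "") {
  fields <- c(
    verb,                             # HTTP Verb
    content_md5,                      # Content-MD5 (empty when not sent)
    content_type,                     # Content-Type
    requestdate,                      # Date (same value as the x-ms-date header)
    paste0("/", account, "/", table)  # Canonicalized resource
  )
  paste(fields, collapse = "\n")
}

# Example with made-up values
table_string <- build_table_string_to_sign(
  verb = "GET", content_type = "text/xml",
  requestdate = "Thu, 04 Mar 2021 05:06:07 GMT",
  account = "storageaccount", table = "Usage"
)
```

The result can be passed as the object argument of digest::hmac in place of the hand-built signature_string.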
Access Azure blob storage from R notebook
The AzureStor package provides an R interface to Azure storage, including files, blobs and ADLSgen2.
library(AzureStor)
endp <- storage_endpoint("https://acctname.blob.core.windows.net", key="access_key")
cont <- storage_container(endp, "mycontainer")
storage_download(cont, "myblob.csv", "local_filename.csv")
Note that this will download to a file in local storage. From there, you can ingest into Spark using standard sparklyr methods.
Disclaimer: I'm the author of AzureStor.
Access Azure Blob Storage through R
For your information, I have been informed that R is not capable of doing the actual mounting. The workaround is to mount using another language like Python and read the file using the library "SparkR" as shown below.
The two most commonly used libraries that provide an R interface to Spark are SparkR and sparklyr. Databricks notebooks and jobs support both packages, although you cannot use functions from both SparkR and sparklyr with the same object.
Mount using Python:
Run R notebook using the library “SparkR”: