How to Clear an NA Flag for a POSIX Value

How do I clear an NA flag for a POSIX value?

Parsing in a time zone that has no daylight saving time, such as GMT, fixes this kind of problem for me.

a_var_posixlt <- strptime(as.character(a_var$Date), '%m/%d/%Y %H:%M', tz = "GMT")

as.POSIXct gives inexplicable NA value

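The problem here is that this timestamp falls at the moment the clocks switch to summer time, so it is not a valid local time in a time zone that makes the switch on that date. If you do not specify a time zone, R uses your local one.
If you specify a time zone without daylight saving time, such as GMT, it will work: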

as.POSIXct('2017-03-26 02:00:00', format = "%Y-%m-%d %H:%M:%S", tz = "GMT")

Which returns:
"2017-03-26 02:00:00 GMT"

You can check ?timezones for more information.

Why do some dates become NA when converted from character to POSIXlt?

I updated my code to specify the GMT time zone, since the data is collected in GMT, which has no change to or from daylight saving time.

dateValue <- strptime(dateString, format='%m/%d/%y %I:%M:%S %p', tz="GMT")

This ensures that properly formatted date-time values no longer return TRUE from is.na().

Can advice of posix_fadvise() be combined?

Implementation

In the implementation of fadvise that I found, a switch statement is applied to the advice value. Settings stored on the file, such as the read-ahead page count file->f_ra.ra_pages and the FMODE_RANDOM flag, do get "switched" depending on the selected advice. Other advice values only trigger a one-off caching call (for example force_page_cache_readahead for POSIX_FADV_WILLNEED) and leave those stored settings alone.

switch (advice) {
case POSIX_FADV_NORMAL:
    file->f_ra.ra_pages = bdi->ra_pages;
    spin_lock(&file->f_lock);
    file->f_mode &= ~FMODE_RANDOM;
    spin_unlock(&file->f_lock);
    break;
case POSIX_FADV_RANDOM:
    spin_lock(&file->f_lock);
    file->f_mode |= FMODE_RANDOM;
    spin_unlock(&file->f_lock);
    break;
case POSIX_FADV_SEQUENTIAL:
    file->f_ra.ra_pages = bdi->ra_pages * 2;
    spin_lock(&file->f_lock);
    file->f_mode &= ~FMODE_RANDOM;
    spin_unlock(&file->f_lock);
    break;
case POSIX_FADV_WILLNEED:
    /* First and last PARTIAL page! */
    start_index = offset >> PAGE_SHIFT;
    end_index = endbyte >> PAGE_SHIFT;
    /* Careful about overflow on the "+1" */
    nrpages = end_index - start_index + 1;
    if (!nrpages)
        nrpages = ~0UL;
    /*
     * Ignore return value because fadvise() shall return
     * success even if filesystem can't retrieve a hint,
     */
    force_page_cache_readahead(mapping, file, start_index, nrpages);
    break;
case POSIX_FADV_NOREUSE:
    break;
case POSIX_FADV_DONTNEED:
    if (!inode_write_congested(mapping->host))
        __filemap_fdatawrite_range(mapping, offset, endbyte,
                                   WB_SYNC_NONE);
    /*
     * First and last FULL page! Partial pages are deliberately
     * preserved on the expectation that it is better to preserve
     * needed memory than to discard unneeded memory.
     */
    start_index = (offset+(PAGE_SIZE-1)) >> PAGE_SHIFT;
    end_index = (endbyte >> PAGE_SHIFT);
    /*
     * The page at end_index will be inclusively discarded according
     * by invalidate_mapping_pages(), so subtracting 1 from
     * end_index means we will skip the last page. But if endbyte
     * is page aligned or is at the end of file, we should not skip
     * that page - discarding the last page is safe enough.
     */
    if ((endbyte & ~PAGE_MASK) != ~PAGE_MASK &&
        endbyte != inode->i_size - 1) {
        /* First page is tricky as 0 - 1 = -1, but pgoff_t
         * is unsigned, so the end_index >= start_index
         * check below would be true and we'll discard the whole
         * file cache which is not what was asked.
         */
        if (end_index == 0)
            break;
        end_index--;
    }
    if (end_index >= start_index) {
        unsigned long count;
        /*
         * It's common to FADV_DONTNEED right after
         * the read or write that instantiates the
         * pages, in which case there will be some
         * sitting on the local LRU cache. Try to
         * avoid the expensive remote drain and the
         * second cache tree walk below by flushing
         * them out right away.
         */
        lru_add_drain();
        count = invalidate_mapping_pages(mapping,
                                         start_index, end_index);
        /*
         * If fewer pages were invalidated than expected then
         * it is possible that some of the pages were on
         * a per-cpu pagevec for a remote CPU. Drain all
         * pagevecs and try again.
         */
        if (count < (end_index - start_index + 1)) {
            lru_add_drain_all();
            invalidate_mapping_pages(mapping, start_index,
                                     end_index);
        }
    }
    break;
default:
    return -EINVAL;
}

Conclusion

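Depending on the system, the implementation might vary (if you're not using Linux), as POSIX does not seem to spell out the rules for combining different advice values. Judging from the code above, some effects can be combined (POSIX_FADV_WILLNEED, for example, does not touch the read-ahead settings made by POSIX_FADV_SEQUENTIAL), while others replace each other (POSIX_FADV_NORMAL, POSIX_FADV_RANDOM and POSIX_FADV_SEQUENTIAL all set or clear the same FMODE_RANDOM flag). Hopefully someone more experienced can elucidate.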
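As a rough illustration of issuing two pieces of advice on the same descriptor, here is a minimal sketch. The helper name advise_sequential_then_willneed is hypothetical, and whether the second call leaves the first one's effect in place is an assumption based only on the Linux code quoted above; other systems may behave differently.

```c
#define _POSIX_C_SOURCE 200112L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/* Hypothetical helper: first declare sequential access for the whole file,
 * then ask for the first `len` bytes to be read ahead.
 * posix_fadvise() returns 0 on success or an error number (it does not
 * set errno), and this helper passes that value through. */
int advise_sequential_then_willneed(int fd, off_t len)
{
    /* offset 0, length 0 means "from the start to the end of the file" */
    int rc = posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    if (rc != 0)
        return rc;

    /* Assumption: on the Linux code shown above this does not reset the
     * read-ahead window chosen by the previous call. */
    return posix_fadvise(fd, 0, len, POSIX_FADV_WILLNEED);
}
```

A caller would open the file, pass the descriptor and a byte count, and treat a non-zero result as an error number such as EBADF.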

Is there a reliable way to detect POSIXlt objects representing a time which does not exist due to DST?

The value of as.POSIXct(test) seems to be platform dependent, which makes a reliable method harder to find. On my Windows machine (R 3.3.1), as.POSIXct(test) produces NA, as the OP also reported. However, on my Linux platform (same R version), I get the following:

times <- c("2015-03-29 01:00",
           "2015-03-29 02:00",
           "2015-03-29 03:00")

test <- strptime(times, format="%Y-%m-%d %H:%M", tz="CET")

test
#[1] "2015-03-29 01:00:00 CET" "2015-03-29 02:00:00 CEST" "2015-03-29 03:00:00 CEST"
as.POSIXct(test)
#[1] "2015-03-29 01:00:00 CET" "2015-03-29 01:00:00 CET" "2015-03-29 03:00:00 CEST"
as.character(test)
#[1] "2015-03-29 01:00:00" "2015-03-29 02:00:00" "2015-03-29 03:00:00"
as.character(as.POSIXct(test))
#[1] "2015-03-29 01:00:00" "2015-03-29 01:00:00" "2015-03-29 03:00:00"

The one thing that we can rely on is not the actual value of as.POSIXct(test), but that it will be different from test when test is an invalid date/time:

(as.character(test) == as.character(as.POSIXct(test))) %in% TRUE
#[1]  TRUE FALSE  TRUE

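I'm not sure that as.character is strictly necessary here, but I include it to make sure we don't fall foul of any other odd behaviours of POSIX objects. The %in% TRUE turns an NA comparison (as on Windows, where as.POSIXct(test) is NA) into FALSE, so the nonexistent time is flagged there as well.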
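The same round-trip idea can be sketched in C with mktime(): convert the wall-clock fields and check whether they come back unchanged. This is only an illustration, not what R does internally. The helper name local_time_exists is hypothetical, the POSIX-style TZ rule string used for Central European time in the usage note is an assumption, and the helper overwrites the process-wide TZ variable.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: returns 1 if the given wall-clock time exists in
 * time zone `tz`, 0 if it does not (or if the conversion fails).
 * Note: this changes the TZ environment variable for the whole process. */
int local_time_exists(const char *tz, int year, int mon, int day,
                      int hour, int min)
{
    struct tm in = {0};
    struct tm out;

    setenv("TZ", tz, 1);
    tzset();

    in.tm_year = year - 1900;
    in.tm_mon = mon - 1;
    in.tm_mday = day;
    in.tm_hour = hour;
    in.tm_min = min;
    in.tm_isdst = -1;   /* let the library decide whether DST applies */

    out = in;
    if (mktime(&out) == (time_t)-1)
        return 0;       /* conversion failed outright */

    /* mktime() rewrites the fields to a real local time, so a time that
     * was skipped by the DST switch cannot come back unchanged. */
    return out.tm_year == in.tm_year && out.tm_mon == in.tm_mon &&
           out.tm_mday == in.tm_mday && out.tm_hour == in.tm_hour &&
           out.tm_min == in.tm_min;
}
```

With the rule string "CET-1CEST,M3.5.0,M10.5.0/3", calling the helper for 2015-03-29 at 01:00, 02:00 and 03:00 should flag only the 02:00 entry as nonexistent, mirroring the R comparison above.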

Why does printf not flush after the call unless a newline is in the format string?

The stdout stream is line buffered by default when it writes to a terminal, so it only displays what's in the buffer once it reaches a newline (or when it's told to). You have a few options to print immediately:

  • Print to stderr instead using fprintf (stderr is unbuffered by default):

    fprintf(stderr, "I will be printed immediately");

  • Flush stdout whenever you need it to using fflush:

    printf("Buffered, will be flushed");
    fflush(stdout); // Will now print everything in the stdout buffer

  • Disable buffering on stdout by using setbuf (before any other operation on the stream):

    setbuf(stdout, NULL);

  • Or use the more flexible setvbuf:

    setvbuf(stdout, NULL, _IONBF, 0);
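The fflush option is often wrapped in a small helper for prompts and progress messages that have no trailing newline. A minimal sketch, where the name print_prompt is hypothetical:

```c
#include <stdio.h>

/* Hypothetical helper: writes `text` to stdout without a newline and
 * flushes right away so it shows up even on a line-buffered stream.
 * Returns 0 on success, -1 if writing or flushing fails. */
int print_prompt(const char *text)
{
    if (fputs(text, stdout) == EOF)
        return -1;
    return fflush(stdout) == 0 ? 0 : -1;
}
```

Calling print_prompt("Working... ") before a long computation makes the message visible immediately, while the buffering mode of stdout stays as it was for the rest of the program.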

Match linebreaks - \n or \r\n?

I will answer your questions in reverse order.

  1. For a full explanation of \r and \n I have to refer to this question, which is far more complete than what I will post here: Difference between \n and \r?

Long story short, Linux uses \n for a newline, Windows uses \r\n and old Macs used \r. So there are multiple ways to write a newline. Your second tool (RegExr), for example, does match on a single \r.

  1. [\r\n]+ as Ilya suggested will work, but it will also match multiple consecutive newlines as one. (\r\n|\r|\n) is more correct, as it matches exactly one line break at a time.
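To see the alternation at work outside a regex tester, here is a minimal sketch using the POSIX <regex.h> interface. The function name count_line_breaks is hypothetical. The pattern is the same alternation without the capturing parentheses, written with C escapes so that the compiled pattern contains the actual CR and LF characters.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: counts line breaks in `text`, treating \r\n, \r
 * and \n as one break each. Returns -1 if the pattern fails to compile. */
int count_line_breaks(const char *text)
{
    regex_t re;
    regmatch_t m;
    int count = 0;

    /* The C escapes put real CR/LF bytes into the pattern: \r\n|\r|\n */
    if (regcomp(&re, "\r\n|\r|\n", REG_EXTENDED) != 0)
        return -1;

    /* Each match is at least one character long, so the loop advances */
    while (regexec(&re, text, 1, &m, 0) == 0) {
        count++;
        text += m.rm_eo;   /* continue right after the matched break */
    }

    regfree(&re);
    return count;
}
```

An input with two consecutive Windows line breaks, such as "a\r\n\r\nb", counts as two breaks here, whereas [\r\n]+ would treat the whole run as a single match.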

