Pull Random Line from Txt File as String

How do I read a random line from one file?

There is nothing built in, but Algorithm R (section 3.4.2, Waterman's "Reservoir Algorithm") from Knuth's "The Art of Computer Programming" works well. Here is a very simplified version:

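import random

def random_line(afile):
    # Start with the first line (raises StopIteration if the file is empty)
    line = next(afile)
    for num, aline in enumerate(afile, 2):
        # randrange(num) is 0 with probability 1/num; otherwise keep the current pick
        if random.randrange(num):
            continue
        line = aline
    return line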

The num, ... in enumerate(..., 2) iterator produces the sequence 2, 3, 4, and so on. randrange(num) therefore returns 0 with a probability of 1.0/num, and that is exactly the probability with which we must replace the currently selected line. This is the sample-size-1 special case of the referenced algorithm; see Knuth's book for the proof of correctness. With a "reservoir" of a single line, we are of course also in the case where it is small enough to fit in memory ;-)
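To see the uniformity claim empirically, here is a minimal sketch that applies the same selection rule to a small in-memory file many times and tallies which line wins. The names pick_line and count_picks, the use of io.StringIO in place of a real file, and the trial count are assumptions made for illustration. Unlike the version above, it returns None for empty input instead of raising StopIteration.

```python
import io
import random
from collections import Counter


def pick_line(lines):
    """Return one uniformly chosen item from an iterable, or None if it is empty."""
    chosen = None
    for num, line in enumerate(lines, 1):
        # Keep the new line with probability 1/num (always true for num == 1).
        if random.randrange(num) == 0:
            chosen = line
    return chosen


def count_picks(text, trials=20000):
    """Run pick_line repeatedly over the same text and tally the winners."""
    counts = Counter()
    for _ in range(trials):
        counts[pick_line(io.StringIO(text))] += 1
    return counts


if __name__ == "__main__":
    random.seed(0)  # fixed seed only so repeated runs print the same tallies
    for line, n in sorted(count_picks("a\nb\nc\nd\n").items()):
        print(repr(line), n)
```

With four lines, each tally should land near a quarter of the trials.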

Read random line from .txt file

If, for some reason, you can't just load the whole set of lines into memory (too big or whatever), there is a way to select a random entry from a streaming set of entries. It won't scale indefinitely, and it will exhibit small biases, but this is a game, not cryptography, so that shouldn't be a dealbreaker.

The logic is:

  1. Declare a buffer to hold the word
  2. Open the file
  3. For each line:

    • Increment a counter indicating which line you're on
    • Generate a random double (e.g. with drand48 or whatever PRNG facilities are available to you)
    • If 1.0 / lineno > randval, replace the currently stored word with the word from the current line (so the first line is auto stored, the second line is 50% likely to replace it, the third is 33% likely to do so, etc.)
  4. When you run out of lines, whatever is stored in word is your selection

Assuming the number of lines is small enough (and the range of doubles produced by your PRNG is fine-grained enough), this gives as close as possible to an equal likelihood of any given line being selected; for two lines, each has a 50/50 shot, for three, 33.33...%, etc.
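The equal-likelihood claim can also be checked with exact arithmetic, setting aside PRNG granularity. The sketch below is an illustration, not part of the original answer: the function name selection_probabilities is hypothetical. It computes the chance that line k ends up selected: it is stored with probability 1/k and must then survive each later line j, which replaces it with probability 1/j.

```python
from fractions import Fraction


def selection_probabilities(n):
    """Exact chance that each of n lines is the final selection.

    Line k is stored with probability 1/k and must then survive every
    later line j, which replaces it with probability 1/j.
    """
    probs = []
    for k in range(1, n + 1):
        p = Fraction(1, k)
        for j in range(k + 1, n + 1):
            p *= 1 - Fraction(1, j)
        probs.append(p)
    return probs


if __name__ == "__main__":
    for n in (1, 2, 3, 10):
        print(n, selection_probabilities(n))
```

For every n the product telescopes to 1/n, which is why no line is favored.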

I lack a C compiler right now, but the basic code would look like:

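#include <stdio.h>
#include <stdlib.h> /* drand48 (POSIX), free */
#include <string.h> /* strcpy, strlen, strdup */

/* Returns a random line (w/o newline) from the file provided.
   The caller must free() the result; returns NULL if the file can't be opened.
   Seed the generator once elsewhere (e.g. with srand48) to vary the result per run. */
char* choose_random_word(const char *filename) {
    FILE *f;
    size_t lineno = 0;
    size_t selectlen;
    char selected[256]; /* Arbitrary, make it whatever size makes sense */
    char current[256];
    selected[0] = '\0'; /* Don't crash if file is empty */

    f = fopen(filename, "r");
    if (f == NULL) {
        return NULL;
    }
    while (fgets(current, sizeof(current), f)) {
        /* Line 1 is always stored, line 2 replaces it half the time, etc. */
        if (drand48() < 1.0 / ++lineno) {
            strcpy(selected, current);
        }
    }
    fclose(f);
    selectlen = strlen(selected);
    if (selectlen > 0 && selected[selectlen-1] == '\n') {
        selected[selectlen-1] = '\0';
    }
    return strdup(selected);
}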

How to Read and Return Random Line From Txt(C)

There are a number of bugs and some improvements to be made.

Here's an annotated version of your code showing bugs and potential changes and fixes:

#include <stdio.h>
#include <time.h>
#include <stdlib.h>

char *
word(char *file, char *str0)
{
    int end, loop, line;
#if 0
    int i;
#endif

    // NOTE/BUG: don't cast the return of malloc
#if 0
    str0 = (char *) malloc(20);
#else
    str0 = malloc(20);
#endif

    FILE *fd = fopen(file, "r");

    // NOTE/BUG: opening and closing the file on _each_ call is wasteful -- having
    // main open the file and passing the FILE pointer is faster
    if (fd == NULL) {
        printf("Failed to open file\n");
        return (NULL);
    }

    // NOTE/BUG: put this in main
#if 0
    srand(time(NULL));
#endif
    line = rand() % 100 + 1;

    for (end = loop = 0; loop < line; ++loop) {
        if (NULL == fgets(str0, 20, fd)) {
            end = 1;
            break;
        }
    }

#if 1
    fclose(fd);
#endif

    if (!end) {
        // NOTE/BUG: the fclose should _always_ be done (even on EOF) -- put it above
#if 0
        fclose(fd);
#endif

        return str0;
        // NOTE/BUG: this will _never_ be executed -- so the return will leak this
        // memory -- put free in main
#if 0
        free(str0);
#endif
    }

    // NOTE/BUG: on EOF, we fall through to here -- we have no return statement
    // for this case
#if 1
    return str0;
#endif
}

int
main(void)
{
    char *str;
    char *str_2;

#if 1
    srand(time(NULL));
#endif
    // NOTE/BUG: passing the 2nd argument does nothing because word will toss any
    // value
    // NOTE/BUG: str and str_2 are _uninitialized_
    printf("%s", word("words.txt", str));
    printf("%s", word("words.txt", str_2));

    // NOTE/BUG: no return for main
#if 1
    return 0;
#endif
}

Here's a cleaned up and working version:

#include <stdio.h>
#include <time.h>
#include <stdlib.h>

char *
word(FILE *fd)
{
    char *str0;
    int end, loop, line;
    int len;

    len = 20;
    str0 = malloc(len);

    line = rand() % 100 + 1;

    rewind(fd);

    end = 0;
    for (loop = 0; loop < line; ++loop) {
        if (NULL == fgets(str0, len, fd)) {
            end = 1;
            break;
        }
    }

    if (end) {
        free(str0);
        str0 = NULL;
    }

    return str0;
}

int
main(void)
{
    char *str;

    srand(time(NULL));

    char *file = "words.txt";
    FILE *fd = fopen(file, "r");

    if (fd == NULL) {
        printf("Failed to open file\n");
        return 1;
    }

    for (int i = 1; i <= 20; ++i) {
        str = word(fd);
        if (str == NULL)
            continue;

        printf("%d: %s", i, str);

        free(str);
    }

    fclose(fd);

    return 0;
}

Doing malloc in word can be wasteful. It's not absolutely wrong if you intend for the caller to save all the strings in an array.

Often, for a function such as word, the caller can pass down the buffer pointer and its maximum length as arguments. This makes word more similar to fgets.

Here's a version that does that, to illustrate an alternate approach:

#include <stdio.h>
#include <time.h>
#include <stdlib.h>

int
word(FILE *fd,char *str0,int len)
{
    int found, loop, line;

    line = rand() % 100 + 1;

    rewind(fd);

    found = 1;
    for (loop = 0; loop < line; ++loop) {
        if (NULL == fgets(str0, len, fd)) {
            found = 0;
            break;
        }
    }

    return found;
}

int
main(void)
{
    char *str;
    int len = 20;

    srand(time(NULL));

    char *file = "words.txt";
    FILE *fd = fopen(file, "r");

    if (fd == NULL) {
        printf("Failed to open file\n");
        return 1;
    }

    str = malloc(len);

    for (int i = 1; i <= 20; ++i) {
        if (! word(fd,str,len))
            continue;

        printf("%d: %s", i, str);
    }

    fclose(fd);
    free(str);

    return 0;
}
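The versions above pick the line number with rand() % 100 + 1, so they assume roughly 100 lines and skip a turn whenever the chosen number falls past the end of the file. One possible way around that is to count the lines first and then pick an index that is guaranteed to be in range. The sketch below shows the idea in Python for brevity; the function name random_line_two_pass is hypothetical and this is not the answer's own method.

```python
import random


def random_line_two_pass(path):
    """Pick a uniformly random line by counting lines, then re-reading."""
    # First pass: count the lines without keeping them in memory.
    with open(path, "r") as f:
        total = sum(1 for _ in f)
    if total == 0:
        return None  # empty file: nothing to choose

    target = random.randrange(total)  # 0-based index of the chosen line

    # Second pass: read up to the chosen line and return it.
    with open(path, "r") as f:
        for index, line in enumerate(f):
            if index == target:
                return line.rstrip("\n")
    return None  # only reached if the file shrank between passes


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        print(random_line_two_pass(sys.argv[1]))
```

The trade-off is reading the file twice, in exchange for never needing to know the line count in advance.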

Read text file line by line and select random line javascript

If you want to read a text file from a path, you need to run your code in an environment such as Node.js. If you are using HTML in the browser and picking the file through an input, read on.

Add some HTML elements to pick a file and to generate text from the picked file:

<input type="file" onchange="loadFile(this.files[0])" />
<button onclick="randomText(TEXT)">Generate</button>

Now add this JS script:

var TEXT = "";
// Named loadFile rather than FileReader so it does not shadow the built-in FileReader
async function loadFile(file) {
    TEXT = await file.text();
}
function randomText(text) {
    // Split on LF or CRLF and skip empty lines
    const textArray = text.split(/\r?\n/).filter((line) => line !== "");
    const randomKey = Math.floor(Math.random() * textArray.length);
    console.log(textArray[randomKey]);
}

How to select a random line from a .txt file in Kotlin

As mentioned in the comments, use readLines() in combination with random():

File("file.txt").readLines().random()

But, as the documentation of readLines() says:

Do not use this function for huge files.

Select random line in txt file with Python

I think what you want to use is .readlines(). This could work like this:

import random

with open("external_file.txt", "r") as file:
    lines = file.readlines()

# Valid indexes run from 0 to len(lines) - 1
line = random.randrange(len(lines))
print(lines[line])

Replace "external_file.txt" with your actual filename.

Note that list indexes start at 0. An index drawn with random.randint(1, max_line) never picks the first line and, because randint includes its upper bound, can run past the end of the list when max_line is the number of lines. random.randrange(len(lines)) avoids both problems.
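If the file fits in memory, random.choice avoids the index arithmetic altogether. A minimal sketch, assuming a hypothetical helper name random_line_from and that trailing newlines should be dropped:

```python
import random
from pathlib import Path


def random_line_from(path):
    """Return a random line from a small text file, without its newline."""
    lines = Path(path).read_text().splitlines()
    if not lines:
        return None  # random.choice raises IndexError on an empty list
    return random.choice(lines)
```

Calling random_line_from("external_file.txt") would then return one line as a string, or None when the file is empty.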

Pull random line from TXT file as string

Make sure your file has read permissions set; it should be chmod'd to 644 or 744.

How do I get a random line from a file?

Use IteratorRandom::choose to randomly sample from an iterator using reservoir sampling. This will scan through the entire file once, creating a String for each line, but it will not build one giant vector holding every line:

use rand::seq::IteratorRandom; // 0.7.3
use std::{
    fs::File,
    io::{BufRead, BufReader},
};

const FILENAME: &str = "/etc/hosts";

fn find_word() -> String {
    let f = File::open(FILENAME)
        .unwrap_or_else(|e| panic!("(;_;) file not found: {}: {}", FILENAME, e));
    let f = BufReader::new(f);

    let lines = f.lines().map(|l| l.expect("Couldn't read line"));

    lines
        .choose(&mut rand::thread_rng())
        .expect("File had no lines")
}

Your original problem is that:

  1. slice::get returns an optional reference into the vector.

    You can either clone this or take ownership of the value:

    let line = lines[n].clone();
    let line = lines.swap_remove(n);

    Both of these panic if n is out-of-bounds, which is reasonable here as you know that you are in bounds.

  2. BufRead::lines returns io::Result<String>, so you have to handle that error case.

Additionally, don't use format! with expect:

expect(&format!("..."))

This will unconditionally allocate memory. When there's no failure, that allocation is wasted. Use unwrap_or_else as shown.


