How to get the latest folder that contains a specific file of interest in Linux and download that file using Paramiko in Python?
Executing the scp command on the remote machine to push the file back to the local machine is overkill. In general, relying on shell commands is a fragile approach. It is better to use native Python code only, both to identify the latest remote file and to pull it to your local machine. Your code will be more robust and readable.
from stat import S_ISDIR

sftp = ssh.open_sftp()
sftp.chdir('/mydir')

# List the remote directory with attributes and keep only subfolders
files = sftp.listdir_attr()
dirs = [f for f in files if S_ISDIR(f.st_mode)]
# Newest folder first
dirs.sort(key=lambda d: d.st_mtime, reverse=True)

filename = 'email_summary.log'
for d in dirs:
    print('Checking ' + d.filename)
    try:
        path = d.filename + '/' + filename
        sftp.stat(path)  # raises IOError if the file does not exist
        print('File exists, downloading...')
        sftp.get(path, filename)
        break
    except IOError:
        print('File does not exist, will try the next folder')
The above is based on:
- How to download only the latest file from SFTP server with Paramiko?
- Paramiko get sorted directory listing
Side note: Do not use AutoAddPolicy. You lose security by doing so. See Paramiko "Unknown Server".
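If you need the same lookup in more than one place, the loop above can be wrapped in a function. This is only a sketch: the name download_from_latest_dir and its parameters are made up for illustration, and it assumes that sftp behaves like a paramiko.SFTPClient (listdir_attr, stat and get).

```python
from stat import S_ISDIR


def download_from_latest_dir(sftp, base_dir, filename, local_path):
    """Download `filename` from the newest subfolder of `base_dir` that has it.

    `sftp` is assumed to behave like paramiko.SFTPClient.
    Returns the remote path that was downloaded, or None if no folder has it.
    """
    entries = sftp.listdir_attr(base_dir)
    dirs = [e for e in entries if S_ISDIR(e.st_mode)]
    # Newest folder first
    dirs.sort(key=lambda d: d.st_mtime, reverse=True)

    for d in dirs:
        remote_path = base_dir.rstrip('/') + '/' + d.filename + '/' + filename
        try:
            sftp.stat(remote_path)  # raises IOError when the file is missing
        except IOError:
            continue  # try the next (older) folder
        sftp.get(remote_path, local_path)
        return remote_path
    return None
```

It would be called like download_from_latest_dir(sftp, '/mydir', 'email_summary.log', 'email_summary.log'). Returning the remote path lets the caller tell which folder the file came from, and None signals that no folder contained it.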
How to download only the latest file from SFTP server with Paramiko?
Use the SFTPClient.listdir_attr instead of the SFTPClient.listdir to get a listing with attributes (including the file timestamp).
Then, find the file entry with the greatest .st_mtime attribute.
The code would be like:
latest = 0
latestfile = None

# Keep the newest entry whose name starts with the wanted prefix
for fileattr in sftp.listdir_attr():
    if fileattr.filename.startswith('Temat') and fileattr.st_mtime > latest:
        latest = fileattr.st_mtime
        latestfile = fileattr.filename

if latestfile is not None:
    sftp.get(latestfile, latestfile)
For a more complex example, see How to get the latest folder that contains a specific file of interest in Linux and download that file using Paramiko in Python?
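The same selection can also be written with the built-in max. A minimal sketch, where pick_latest is a hypothetical helper name and the entries are assumed to carry filename and st_mtime attributes, as the objects returned by SFTPClient.listdir_attr do:

```python
def pick_latest(entries, prefix=''):
    """Return the filename of the most recently modified matching entry.

    `entries` is any iterable of objects with `filename` and `st_mtime`
    attributes. Returns None when no entry matches the prefix.
    """
    candidates = [e for e in entries if e.filename.startswith(prefix)]
    if not candidates:
        return None
    return max(candidates, key=lambda e: e.st_mtime).filename
```

With a connected client, that would be used as latestfile = pick_latest(sftp.listdir_attr(), 'Temat'), followed by the same "is not None" check before calling sftp.get.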
How to sort file list pulled from SFTP server using Paramiko by modification date?
Retrieve the listing with file attributes (including the modification time) using SFTPClient.listdir_attr. Then sort the list by the SFTPAttributes.st_mtime field.
filesInSFTP = sftp.listdir_attr(sftpPullDirectory)
filesInSFTP.sort(key = lambda f: f.st_mtime)
Related questions:
- How to get the latest folder that contains a specific file of interest in Linux and download that file using Paramiko in Python?
- Paramiko get sorted directory listing
Obligatory warning: Do not use AutoAddPolicy. You are losing protection against MITM attacks by doing so. For a correct solution, see Paramiko "Unknown Server".
Execute command of remote host with Paramiko and download a file that the command creates once it completes
You are not waiting for the command to finish. So you are downloading an incomplete file.
To wait for the command to finish, you can do the following:
stdin, stdout, stderr = ssh_client.exec_command(cmd_stmt)
stdout.channel.set_combine_stderr(True)
output = stdout.readlines()
ftp_client = ssh_client.open_sftp()
For more, see Wait to finish command executed with Python Paramiko.
Obligatory warning: Do not use AutoAddPolicy. You are losing protection against MITM attacks by doing so. For a correct solution, see Paramiko "Unknown Server".
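If you also want to know whether the command succeeded before downloading its output file, you can read the exit status after draining the output. A sketch only: run_and_wait is a hypothetical helper name, and ssh_client is assumed to behave like a connected paramiko.SSHClient.

```python
def run_and_wait(ssh_client, command):
    """Run `command` and block until it finishes.

    `ssh_client` is assumed to behave like paramiko.SSHClient.
    Returns a tuple (exit_status, output_lines).
    """
    stdin, stdout, stderr = ssh_client.exec_command(command)
    # Merge stderr into stdout so that a single read drains both streams
    stdout.channel.set_combine_stderr(True)
    # Reading to EOF blocks until the remote command closes its output
    output = stdout.readlines()
    # The output is drained, so asking for the exit status cannot stall
    exit_status = stdout.channel.recv_exit_status()
    return exit_status, output
```

The output is read before the exit status is requested on purpose: if a command produces a lot of output and nobody reads it, waiting for the exit status first can hang.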
Paramiko get sorted directory listing
There's no way to make SFTPClient.listdir_attr return a sorted list.
Sorting is easy though:
files = sftp.listdir_attr()
files.sort(key = lambda f: f.filename)
Or for example, if you want to sort only files by size from the largest to the smallest:
from stat import S_ISDIR, S_ISREG
files = [f for f in files if not S_ISDIR(f.st_mode)]
files.sort(key = lambda f: f.st_size, reverse = True)
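If you sort listings in several ways, a small helper that returns a new list may be handy. This is a sketch under assumptions: sorted_listing and its parameters are invented for illustration, and the entries are assumed to have the SFTPAttributes fields filename, st_size, st_mtime and st_mode.

```python
from stat import S_ISREG


def sorted_listing(entries, by='filename', reverse=False, files_only=False):
    """Return a new list of directory entries sorted by one attribute.

    `by` is one of 'filename', 'st_size' or 'st_mtime'.
    With `files_only`, everything except regular files is dropped first.
    """
    if by not in ('filename', 'st_size', 'st_mtime'):
        raise ValueError('unsupported sort key: ' + by)
    if files_only:
        entries = [e for e in entries if S_ISREG(e.st_mode)]
    return sorted(entries, key=lambda e: getattr(e, by), reverse=reverse)
```

For example, sorted_listing(sftp.listdir_attr(), by='st_size', reverse=True, files_only=True) lists regular files from the largest to the smallest. Unlike list.sort, it leaves the original listing untouched.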
Iterating through a large 20+ GB file from a server with Python
You're... in for pain. I recommend you follow the rsync route and write a script that runs on the server which serves up the bytes you're interested in. You can communicate with it via a text channel created by paramiko.
Read file from remote server completely to local machine in Python SSHClient?
So here is what worked for me.
- The file was too big to read from the remote server through the SSH client, and I was only looking for the completion indicator, which appears only at the end of the file.
- My strategy was to connect to the remote server and run the Linux 'tac' command, which reverses the order of the lines in the file. I then read the reversed file, found the end-of-file marker at its beginning, and was able to validate the file.
Code:
# The SSH client is already connected to the remote server; filename holds the remote file name
stdin, stdout, stderr = ssh.exec_command("tac " + filename + " > filereversed.txt")
stdout.channel.recv_exit_status()  # wait until tac has finished
# Read filereversed.txt from the remote server and validate it
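If the server also allows SFTP, another option is to read only the last bytes of the file directly, which avoids creating a reversed copy on the server. This is a sketch and not what the answer above used: read_tail is a hypothetical helper, and sftp is assumed to behave like a paramiko.SFTPClient (stat and open).

```python
def read_tail(sftp, remote_path, nbytes=1024):
    """Return the last `nbytes` bytes of a remote file without reading it all.

    `sftp` is assumed to behave like paramiko.SFTPClient.
    If the file is shorter than `nbytes`, the whole file is returned.
    """
    size = sftp.stat(remote_path).st_size
    start = max(0, size - nbytes)
    with sftp.open(remote_path, 'rb') as remote_file:
        remote_file.seek(start)  # jump close to the end of the file
        return remote_file.read(size - start)
```

The returned bytes can then be checked for the completion marker, for example with marker in read_tail(sftp, path), where marker is whatever byte string your file ends with.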
Is it possible to transfer files from a directory using SCP in Python but ignore hidden files or sym links?
There's no API in SCPClient to skip hidden files or symbolic links.
For upload, it's easy if you copy the SCPClient's code and modify it as you need. See the os.walk loop in the _send_recursive function.
If you do not want to modify the SCPClient's code, you will have to iterate the files on your own, calling SCPClient.put for each. It will be somewhat less efficient, as it will start a new SCP server for each file.
For download, you might be able to modify the SCPClient code to respond with a non-zero code to the C commands fed by the server for the files you do not want to download.
Check the _recv_file function. There, where name is resolved, check for the names or attributes of the files you are not interested in downloading, do chan.send('\x01') and exit the function.
Though why do you want to use SCP? Use SFTP. It is much better suited for custom rules you need.
Paramiko does not have recursive SFTP transfer functionality (but pysftp does, see pysftp vs. Paramiko). You would not be able to use it for your specific needs anyway, for the same reason you cannot use the recursive transfer with SCP.
But check my answer to Python pysftp get_r from Linux works fine on Linux but not on Windows. It shows a simple recursive SFTP download code. Just modify it slightly to skip the files you do not want to download.
Something like
if (not S_ISLNK(mode)) and (not entry.filename.startswith(".")):
(see Checking if a file on SFTP server is a symbolic link, and deleting the symbolic link, using Python Paramiko/pysftp)
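Put together, a recursive SFTP download with that condition could look like the sketch below. It is an illustration under assumptions, not the code from the linked answer: download_tree is a hypothetical name, and sftp is assumed to behave like a paramiko.SFTPClient (listdir_attr and get).

```python
import os
from stat import S_ISDIR, S_ISLNK


def download_tree(sftp, remote_dir, local_dir):
    """Recursively download `remote_dir`, skipping hidden entries and symlinks.

    `sftp` is assumed to behave like paramiko.SFTPClient.
    Returns the list of remote paths that were downloaded.
    """
    downloaded = []
    os.makedirs(local_dir, exist_ok=True)
    for entry in sftp.listdir_attr(remote_dir):
        mode = entry.st_mode
        if S_ISLNK(mode) or entry.filename.startswith('.'):
            continue  # skip symbolic links and hidden files or folders
        remote_path = remote_dir.rstrip('/') + '/' + entry.filename
        local_path = os.path.join(local_dir, entry.filename)
        if S_ISDIR(mode):
            # Descend into the subfolder and collect what it downloaded
            downloaded += download_tree(sftp, remote_path, local_path)
        else:
            sftp.get(remote_path, local_path)
            downloaded.append(remote_path)
    return downloaded
```

Because the filter sits in your own loop, any other rule (file size, extension, modification time) can be added in the same place.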