Making Python Script Exit Completely with One Ctrl+C

Dear Python community,

I hope this message finds you in good health. I am reaching out to seek your assistance regarding an issue that I have been experiencing with one of my Python scripts that involves the use of libcurl.

The script, named proxy-speed.py, is employed to measure and display the speed of different proxies. It runs through a list of proxies, giving the current and average speed for each, one at a time. However, the issue arises when I try to terminate the script prematurely using Ctrl+C. The KeyboardInterrupt appears to only pause the current task, and the script instantaneously proceeds with the next proxy on the list, clearly illustrating that it doesn’t exit entirely.

Here’s an example illustrating the issue:

  File "/home/werner/Desktop/proxy-speed/proxy-speed.py", line 54, in progress
    def progress(download_t, download_d, upload_t, upload_d):

KeyboardInterrupt
Average Speed: 3944.97 kB/s

    Proxy: SG_ssr_futeapbquf5m.nodelist.club_1356_8ce27b1301fcbcb77cc5abb28cb127a6
    ^CTraceback (most recent call last):
      File "/home/werner/Desktop/proxy-speed/proxy-speed.py", line 54, in progress
        def progress(download_t, download_d, upload_t, upload_d):

    KeyboardInterrupt
    Average Speed: 72.81 kB/s

This pattern repeats on every Ctrl+C. My goal is to modify the script so that a single Ctrl+C terminates it completely.

Any guidance or suggestions that you could offer regarding this issue would be greatly appreciated. If there is any further information needed to better understand the issue, please let me know.

The content of proxy-speed.py is as follows:

import subprocess
import sys
import time
import pycurl
from io import BytesIO 

def fetch_proxies():
    command = 'echo "show stat" | sudo socat stdio /var/run/haproxy.sock 2>/dev/null | awk -F, \'$1=="socks5" && !($2~/^(FRONTEND|BACKEND)$/) {print $2,$74}\''
    process = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
    output, error = process.communicate()

    result_dict = {}  # Initialize default empty dictionary

    if error is not None:
        return result_dict  # Return the empty dictionary in case of error

    lines = output.decode('utf-8').split('\n')

    for line in lines:
        if line != "":
            key_value = line.split(' ')
            result_dict[key_value[0]] = key_value[1]

    return result_dict


def test_proxy(proxy, url):
    global last_calc_time, download_start_time

    buffer = BytesIO()

    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.WRITEDATA, buffer)
    c.setopt(pycurl.PROXY, proxy)
    c.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5_HOSTNAME)
    c.setopt(pycurl.NOPROGRESS, False)
    c.setopt(pycurl.XFERINFOFUNCTION, progress)
    c.setopt(pycurl.TIMEOUT, 5)

    download_start_time = time.time()
    last_calc_time = download_start_time

    try:
        c.perform()
    except pycurl.error as e:
        pass
    except KeyboardInterrupt:
        print("Script interrupted by user. Exiting.")
        sys.exit(0)  # immediately exit the script

    average_speed = c.getinfo(pycurl.SPEED_DOWNLOAD) / 1024
    return average_speed

def progress(download_t, download_d, upload_t, upload_d):
    global last_calc_time, download_start_time
    current_time = time.time()

    if current_time - last_calc_time >= 2:
        elapsed_time = current_time - download_start_time
        current_speed = download_d / elapsed_time / 1024
        print(f"Current Speed: {current_speed:.2f} kB/s")
        last_calc_time = current_time

proxy_data = fetch_proxies()
url = "http://ipv4.download.thinkbroadband.com/1GB.zip"

for key, value in proxy_data.items():
    print(f"Proxy: {key}")
    try:
        average_speed = test_proxy(value, url)
        print(f"Average Speed: {average_speed:.2f} kB/s")
    except KeyboardInterrupt:
        print("Script interrupted by user. Exiting.")
        break  # exit the for loop


Thank you very much for your help in advance.

Best Regards,
Zhao

P.S: If possible, could you please provide both an explanation and snippet of the solution so that I can understand and learn from it for future references.

Just a suggestion: have you considered an in-app solution that reads a key press, rather than relying on Ctrl+C, and having your app terminate on, say, Esc, or any key that you feel is appropriate?

I’ve written a couple of apps this way and use the Blessed library for this kind of functionality.

I still don’t know how to implement it specifically based on my example.

I’ve never tried this with a subprocess (I don’t use them) and it would take me more time than I have to spare, to implement this for you, but here’s a small demo:

import sys
from blessed import Terminal
from time import sleep


def df1():
    # this can be any function
    print("Call to df1")
    # this is just a dummy routine, just so we have something to do
    for x in range(10):
        print(x)
        sleep(1)


def main():
    # any options you choose. the first three are just for show
    options = ("0", "1", "2", ESCAPE_KEY)  # ESCAPE_KEY to exit
    inp = ""
    with TERM.cbreak():
        while inp not in options:
            inp = TERM.inkey(timeout=0.05)
            if inp == ESCAPE_KEY:
                sys.exit("Esc key detected.")
            else:
                # do whatever
                df1()

ESCAPE_KEY = "\x1b"
TERM = Terminal()
main()

If you run this, you can press the Esc key at any point, and the app will terminate as soon as it’s finished its last call; in this demo, that will always be df1()

Maybe this will help, maybe not, but here it is, for what it’s worth.

You catch KeyboardInterrupt in two places.
One is in test_proxy that is repeatedly called.
I cannot recall what the sys.exit(0) will do in this context.

Suggest you stop catching in test_proxy and I think the code will do what you want.

FYI, since you are using Python, you can simplify your subprocess call:
send “show stat” as the process input and just run ['sudo', 'socat', 'stdio', '/var/run/haproxy.sock']. The 2>/dev/null corresponds to passing stderr=subprocess.DEVNULL.
Once you have the output, you can do the logic of the awk directly in Python.

Having two places where you try to catch the KeyboardInterrupt is redundant, but I don’t see why this would be problematic all by itself (either it’s caught in one place or the other, and in both cases the program should exit afterwards). The sys.exit call (if called) will raise SystemExit (which, like KeyboardInterrupt, derives from BaseException rather than Exception), and this is not caught in top-level code, so the script should exit.
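To make the class relationship concrete, here is a minimal sketch showing that an except Exception clause catches ordinary errors but lets SystemExit pass through. The helper names (swallow_ordinary_errors, raise_value_error, request_exit) are made up for illustration and are not part of the script above.

```python
import sys


def swallow_ordinary_errors(action):
    """Run action() and hide anything derived from Exception."""
    try:
        action()
    except Exception:
        # ValueError, OSError, ... land here; SystemExit and
        # KeyboardInterrupt do not, because they derive from BaseException.
        return "swallowed"
    return "completed"


def raise_value_error():
    raise ValueError("ordinary error")


def request_exit():
    sys.exit(3)  # raises SystemExit(3)


if __name__ == "__main__":
    print(issubclass(KeyboardInterrupt, Exception))
    print(issubclass(SystemExit, BaseException))
    print(swallow_ordinary_errors(raise_value_error))
```

Calling swallow_ordinary_errors(request_exit) would not return at all: the SystemExit travels up to the caller, which is why a sys.exit inside test_proxy should normally end the script.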

I think you can see that in principle the code is OK when you run a modified/simplified version.
For instance the following

import sys
import time
import pycurl
from io import BytesIO

url = "https://discuss.python.org/t/making-python-script-exit-completely-with-one-ctrl-c/40529/7"

def test():
    c = pycurl.Curl()
    buffer = BytesIO()
    
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.WRITEDATA, buffer)
    c.setopt(pycurl.TIMEOUT, 1)

    try:
        c.perform()            
        time.sleep(1)
    except pycurl.error as e:
        pass
    except KeyboardInterrupt:
        print("KeyboardInterrupt in test")
        sys.exit(1)

    average_speed = c.getinfo(pycurl.SPEED_DOWNLOAD) / 1024
    return average_speed

for _ in range(5):
    try:
        print(test())
    except KeyboardInterrupt:
        print("KeyboardInterrupt (global)")
        break

behaves as expected (at least on my host, macOS M2/Sonoma): Ctrl+C always leads to “KeyboardInterrupt in test” with exit code 1. But KeyboardInterrupt is a wily beast. It’s not a normal Exception (only a BaseException) and the exact time when it will be caught might be a bit random. In this case, I also wonder how this interacts with pycurl (calling C code plus also in your case using a Python callback).

You could also consider removing the except KeyboardInterrupt clauses and setting up a signal handler.
Perhaps this might work better here?

import signal
import sys

def sigint_handler(signum, frame):
    # The first parameter is named signum so it does not shadow the signal module
    print(f"KeyboardInterrupt at line {frame.f_lineno}")
    sys.exit(0)

signal.signal(signal.SIGINT, sigint_handler)

The callback into progress is what is causing the problem. When I add the progress function to the simplified code (see earlier), the problem becomes reproducible for me. The interrupt inside test is then printed out but doesn’t lead to system exit! I’m not 100% sure I understand why not, but I assume it’s related to the fact that a Python function is here called back from (non-Python) C code.

It turns out that checking for KeyboardInterrupt inside the callback and then calling sys.exit does not work. I think the simple reason is that this Python callback may not see some low-level signals like KeyboardInterrupt: they may be handled at that point by the C code, leading the callback to be aborted (or leading to some specific other error). Here, in the test loop, on pycurl exceptions, if you get a pycurl.error “(42, 'Callback aborted')”, you can call sys.exit (which then will work, since you’re in the Python main thread).

So, the following should work:

def progress(download_t, download_d, upload_t, upload_d):
    global last_calc_time, download_start_time
    current_time = time.time()

    # A try/except KeyboardInterrupt around this block (returning -1 from the
    # except clause) was tried out, but see the "Update" comment below:
    # I don't think it is ever hit.
    if current_time - last_calc_time >= 2:
        elapsed_time = current_time - download_start_time
        current_speed = download_d / elapsed_time / 1024
        print(f"Current Speed: {current_speed:.2f} kB/s")
        last_calc_time = current_time
    return 0

And (note that I added a finally block to make sure the connection is cleaned up since I don’t really know what happens on the C-level):

def test_proxy(proxy, url):
    global last_calc_time, download_start_time
    buffer = BytesIO()

    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.WRITEDATA, buffer)
    c.setopt(pycurl.PROXY, proxy)
    c.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5_HOSTNAME)
    c.setopt(pycurl.NOPROGRESS, False)
    c.setopt(pycurl.XFERINFOFUNCTION, progress)
    c.setopt(pycurl.TIMEOUT, 5)

    download_start_time = time.time()
    last_calc_time = download_start_time

    average_speed = 0.0

    try:
        c.perform()
        average_speed = c.getinfo(pycurl.SPEED_DOWNLOAD) / 1024
    except pycurl.error as e:
        if e.args == (42, "Callback aborted"):
            # you may want to look up if this can also be caused by other signals
            # if the callback returns 1 instead of 0 or None, then you'll also get here
            sys.exit(3)
        else:
            print("Ignoring pycurl exception", e)
    except KeyboardInterrupt:  
        # this block is effectively dead code, extremely unlikely to ever be hit
        print("Script interrupted by user. Exiting.")
        sys.exit(0)  # immediately exit the script
    finally:
        c.close()

    return average_speed

A much simpler fix is to use the simple sigint handler (see earlier post too); this works as expected (on my machine) also when the progress function is used.

Note that the original code

     except pycurl.error as e:
         pass

is just always bad style, asking for either trouble or confusion (or both :))

Update (1): It seems the try/except block in progress is not needed (and you can always return 0 or None as in the original code). In fact, the Python KeyboardInterrupt there also never seems to be hit. So, it seems to be necessary and sufficient to handle the pycurl.error. The reason why the KeyboardInterrupt is not hit inside the test_proxy function seems to be that this error is already consumed by/converted into a pycurl.error (the “KeyboardInterrupt” logging comes from pycurl and is evidence for this). So, it’s sufficient to handle the specific “Callback aborted” error there.

So, as far as I understand it now, it turns out to be impossible to handle KeyboardInterrupt inside the progress callback; the interrupt seems to always be intercepted by pycurl itself and is never propagated to Python. I think that could be considered a bug in pycurl.

Update (2): The use of SOCKS proxies and signal handlers muddies the waters a bit too. Apparently there are some issues with that in pycurl; see pycurl/pycurl issue #706 on GitHub, “Curl.perform() blocks SIGINT during the start of a SOCKS transfer”.
But the original issue is also reproducible when not using proxies, so this may be a red herring here.

TL;DR

  • Never use an except block that only consists of pass
  • Python callbacks called from non-Python code may not completely behave as normal functions;
    some low-level exceptions (signals) may be handled by the C-code and Python may not see them
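As an alternative to a bare pass, the error code can be inspected and each case handled on purpose. The sketch below assumes only that pycurl.error carries a (code, message) tuple in its args, as seen in the “(42, 'Callback aborted')” example above; the function name classify_curl_error and the labels it returns are made up for illustration. The numeric codes are libcurl's CURLE_OPERATION_TIMEDOUT (28) and CURLE_ABORTED_BY_CALLBACK (42).

```python
ABORTED_BY_CALLBACK = 42  # libcurl CURLE_ABORTED_BY_CALLBACK
OPERATION_TIMEDOUT = 28   # libcurl CURLE_OPERATION_TIMEDOUT


def classify_curl_error(args):
    """Map the (code, message) tuple from an error's args to a short label."""
    code = args[0] if args else None
    if code == ABORTED_BY_CALLBACK:
        return "aborted"   # e.g. Ctrl+C during the progress callback
    if code == OPERATION_TIMEDOUT:
        return "timeout"   # expected here: 5 s TIMEOUT on a 1 GB download
    return "other"         # anything else deserves at least a log line


if __name__ == "__main__":
    # Intended use inside test_proxy (pycurl itself is not imported here):
    #     except pycurl.error as e:
    #         kind = classify_curl_error(e.args)
    print(classify_curl_error((42, "Callback aborted")))
```

The caller can then exit on "aborted", carry on quietly on "timeout", and print or log the exception for "other".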

Thank you for your advice, I changed to the following version:

def fetch_proxies():
    # Starting a subprocess with specified arguments
    process = subprocess.run(['sudo', 'socat', 'stdio', '/var/run/haproxy.sock'], 
                            input="show stat\n".encode(), 
                            stdout=subprocess.PIPE, 
                            stderr=subprocess.PIPE, 
                            check=True)

    # Getting the output
    output = process.stdout

    # Initializing an empty dictionary to store the results
    result_dict = {}

    # Decoding and splitting the output into lines
    lines = output.decode().split('\n')

    # Parsing each line
    for line in lines:
        parts = line.split(',')
        # Require at least 74 fields so that parts[73] (awk's $74) exists
        if len(parts) > 73 and parts[0] == 'socks5' and parts[1] not in ['FRONTEND', 'BACKEND']:
            result_dict[parts[1]] = parts[73]

    return result_dict

Thank you very much for your demonstration code snippet. I finally came up with the following working method:

$ cat working-version-1.py
import time
import sys
import pycurl
from io import BytesIO 
from blessed import Terminal
import subprocess


last_calc_time = None
download_start_time = None
interrupted = False


#def fetch_proxies():
#    command = 'echo "show stat" | sudo socat stdio /var/run/haproxy.sock 2>/dev/null | awk -F, \'$1=="socks5" && !($2~/^(FRONTEND|BACKEND)$/) {print $2,$74}\''
#    process = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
#    output, error = process.communicate()

#    result_dict = {}  # Initialize default empty dictionary

#    if error is not None:
#        return result_dict  # Return the empty dictionary in case of error

#    lines = output.decode('utf-8').split('\n')

#    for line in lines:
#        if line != "":
#            key_value = line.split(' ')
#            result_dict[key_value[0]] = key_value[1]

#    return result_dict
    
def fetch_proxies():
    # Starting a subprocess with specified arguments
    process = subprocess.run(['sudo', 'socat', 'stdio', '/var/run/haproxy.sock'], 
                            input="show stat\n".encode(), 
                            stdout=subprocess.PIPE, 
                            stderr=subprocess.PIPE, 
                            check=True)

    # Getting the output
    output = process.stdout

    # Initializing an empty dictionary to store the results
    result_dict = {}

    # Decoding and splitting the output into lines
    lines = output.decode().split('\n')

    # Parsing each line
    for line in lines:
        parts = line.split(',')
        # Require at least 74 fields so that parts[73] (awk's $74) exists
        if len(parts) > 73 and parts[0] == 'socks5' and parts[1] not in ['FRONTEND', 'BACKEND']:
            result_dict[parts[1]] = parts[73]

    return result_dict
    
    
def test_proxy(proxy, url):
    global last_calc_time, download_start_time, interrupted

    buffer = BytesIO()

    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.WRITEDATA, buffer)
    c.setopt(pycurl.PROXY, proxy)
    c.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5_HOSTNAME)
    c.setopt(pycurl.NOPROGRESS, False)
    c.setopt(pycurl.XFERINFOFUNCTION, progress)
    c.setopt(pycurl.TIMEOUT, 5)

    download_start_time = time.time()
    last_calc_time = download_start_time

    try:
        c.perform()
    except pycurl.error as e:
        pass
    except KeyboardInterrupt:
        print("Script interrupted by user.")
        interrupted = True  # set the flag to True when interrupted

    average_speed = c.getinfo(pycurl.SPEED_DOWNLOAD) / 1024
    return average_speed  
    
def progress(download_t, download_d, upload_t, upload_d):
    global last_calc_time, download_start_time
    current_time = time.time()

    if current_time - last_calc_time >= 2:
        elapsed_time = current_time - download_start_time
        current_speed = download_d / elapsed_time / 1024
        print(f"Current Speed: {current_speed:.2f} kB/s")
        last_calc_time = current_time    


def listener(terminal):
    with terminal.cbreak():
        key = terminal.inkey(timeout=0.05)
        if key == '\x1b':
            sys.exit("Esc key detected!")

proxy_data = fetch_proxies()
url = "http://ipv4.download.thinkbroadband.com/1GB.zip"
term = Terminal()

for key, value in proxy_data.items():
    if interrupted:
        sys.exit("Exiting due to interrupt.")

    print(f"Proxy: {key}")
    listener(term)  # Listen for escape key press after each proxy test

    try:
        average_speed = test_proxy(value, url)
        print(f"Average Speed: {average_speed:.2f} kB/s")
    except KeyboardInterrupt:
        sys.exit("Exiting due to interrupt.")

The test results are as follows:

(datasci) werner@X10DAi:~/Desktop/proxy-speed$ python working-version-1.py 
Proxy: SG_ssr_futeapbquf5m.nodelist.club_1453_018d83c677e05e71f172014fc3f45e39
Current Speed: 671.90 kB/s
^[Current Speed: 7220.37 kB/s
Average Speed: 8817.69 kB/s
Proxy: HK_ssr_wo8o8npg4fny.nodelist.club_1303_b5bf85111d0f51c517ec7302d3f33ce1
Esc key detected!

I tried as follows, but it cannot solve the problem:

import subprocess
import time
import pycurl
from io import BytesIO 

def fetch_proxies():
    command = 'echo "show stat" | sudo socat stdio /var/run/haproxy.sock 2>/dev/null | awk -F, \'$1=="socks5" && !($2~/^(FRONTEND|BACKEND)$/) {print $2,$74}\''
    process = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
    output, error = process.communicate()

    result_dict = {}  # initialization the empty dict

    if error is not None:
        return result_dict  # return empty dict if error

    lines = output.decode('utf-8').split('\n')

    for line in lines:
        if line != "":
            key_value = line.split(' ')
            result_dict[key_value[0]] = key_value[1]

    return result_dict


def test_proxy(proxy, url):
    global last_calc_time, download_start_time

    buffer = BytesIO()

    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.WRITEDATA, buffer)
    c.setopt(pycurl.PROXY, proxy)
    c.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5_HOSTNAME)
    c.setopt(pycurl.NOPROGRESS, False)
    c.setopt(pycurl.XFERINFOFUNCTION, progress)
    c.setopt(pycurl.TIMEOUT, 5)

    download_start_time = time.time()
    last_calc_time = download_start_time

    try:
        c.perform()
    except pycurl.error as e:
        pass

    average_speed = c.getinfo(pycurl.SPEED_DOWNLOAD) / 1024
    return average_speed

def progress(download_t, download_d, upload_t, upload_d):
    global last_calc_time, download_start_time
    current_time = time.time()

    if current_time - last_calc_time >= 2:
        elapsed_time = current_time - download_start_time
        current_speed = download_d / elapsed_time / 1024
        print(f"current_speed: {current_speed:.2f} kB/s")
        last_calc_time = current_time

proxy_data = fetch_proxies()
url = "http://ipv4.download.thinkbroadband.com/1GB.zip"

for key, value in proxy_data.items():
    print(f"proxy: {key}")
    try:
        average_speed = test_proxy(value, url)
        print(f"average_speed: {average_speed:.2f} kB/s")
    except KeyboardInterrupt:
        print("user interrupt, exit.")
        break

See the test results below:

(datasci) werner@X10DAi:~/Desktop/proxy-speed$ python temp-test2.py 
proxy: SG_ssr_futeapbquf5m.nodelist.club_1453_018d83c677e05e71f172014fc3f45e39
current_speed: 2685.67 kB/s
current_speed: 7676.65 kB/s
average_speed: 9002.48 kB/s
proxy: HK_ssr_wo8o8npg4fny.nodelist.club_1303_b5bf85111d0f51c517ec7302d3f33ce1
current_speed: 1788.71 kB/s
current_speed: 7130.96 kB/s
^CTraceback (most recent call last):
  File "/home/werner/Desktop/proxy-speed/temp-test2.py", line 51, in progress
    def progress(download_t, download_d, upload_t, upload_d):
    
KeyboardInterrupt
average_speed: 7393.14 kB/s
proxy: SG_ssr_futeapbquf5m.nodelist.club_1354_ddfba110eddfdb7037f389e9b5917477
current_speed: 2301.29 kB/s
^CTraceback (most recent call last):
  File "/home/werner/Desktop/proxy-speed/temp-test2.py", line 51, in progress
    def progress(download_t, download_d, upload_t, upload_d):
    
KeyboardInterrupt
average_speed: 7925.51 kB/s
proxy: HK_ssr_fwqvvo60u1mj.nodelist.club_1423_dba3b0996744e7b82eae7634626a5ba9
current_speed: 620.63 kB/s
current_speed: 4057.07 kB/s
average_speed: 5692.93 kB/s

Thank you so much for your wonderful tips and incisive analysis. Based on your valuable comments and suggestions, I finally worked out the following solution:

$ cat working-version-2.py 
#https://discuss.python.org/t/making-python-script-exit-completely-with-one-ctrl-c/40529/12?u=hongyi-zhao
import subprocess
import time
import pycurl
from io import BytesIO
import sys
import signal

def sigint_handler(signal, frame):
    print(f"KeyboardInterrupt at line {frame.f_lineno}")
    sys.exit(0)

signal.signal(signal.SIGINT, sigint_handler)

def fetch_proxies():
    command = 'echo "show stat" | sudo socat stdio /var/run/haproxy.sock 2>/dev/null | awk -F, \'$1=="socks5" && !($2~/^(FRONTEND|BACKEND)$/) {print $2,$74}\''
    try:
        process = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
        output, error = process.communicate()
    except Exception as e:
        print("Error occurred while executing shell command: ", e)
        return {}

    result_dict = {}

    if error is not None:
        print("Error occurred while getting proxies status: ", error.decode('utf-8'))

    lines = output.decode('utf-8').split('\n')

    for line in lines:
        if line and len(line.split(' ')) == 2:
            key_value = line.split(' ')
            result_dict[key_value[0]] = key_value[1]

    return result_dict

def test_proxy(proxy, url):
    global last_calc_time, download_start_time, total_downloaded_data
    buffer = BytesIO()

    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.WRITEDATA, buffer)
    c.setopt(pycurl.PROXY, proxy)
    c.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5_HOSTNAME)
    c.setopt(pycurl.NOPROGRESS, False)
    c.setopt(pycurl.XFERINFOFUNCTION, progress)
    c.setopt(pycurl.TIMEOUT, 5)

    download_start_time = time.time()
    last_calc_time = download_start_time
    total_downloaded_data = 0

    try:
        c.perform()
    except pycurl.error as e:
        #print("PycURL error occurred during proxy testing: ", e)
        pass
    finally:
        c.close()

    elapsed_time = time.time() - download_start_time
    average_speed = total_downloaded_data / elapsed_time / 1024
    return average_speed
    

def progress(download_t, download_d, upload_t, upload_d):
    global last_calc_time, download_start_time, total_downloaded_data
    current_time = time.time()

    if current_time - last_calc_time >= 2:
        elapsed_time = current_time - download_start_time  # Time since download start
        current_speed = download_d / elapsed_time / 1024   # Speed since download start
        print(f"Current Speed: {current_speed:.2f} kB/s")
        last_calc_time = current_time
        total_downloaded_data = download_d

    return 0
    
    
proxy_data = fetch_proxies()

if not proxy_data:
    print("No proxy data found.")
    sys.exit(1)

url = "http://ipv4.download.thinkbroadband.com/1GB.zip"

for key, value in proxy_data.items():
    print(f"Testing Proxy: {key}")
    average_speed = test_proxy(value, url)
    print(f"Average Speed: {average_speed:.2f} kB/s")

See the test results below:

(datasci) werner@X10DAi:~/Desktop/proxy-speed$ python working-version-2.py 
Testing Proxy: SG_ssr_futeapbquf5m.nodelist.club_1453_018d83c677e05e71f172014fc3f45e39
Current Speed: 3482.44 kB/s
Current Speed: 9887.41 kB/s
Average Speed: 7911.91 kB/s
Testing Proxy: HK_ssr_wo8o8npg4fny.nodelist.club_1303_b5bf85111d0f51c517ec7302d3f33ce1
Current Speed: 859.10 kB/s
Current Speed: 5546.01 kB/s
Average Speed: 4554.96 kB/s
Testing Proxy: SG_ssr_futeapbquf5m.nodelist.club_1354_ddfba110eddfdb7037f389e9b5917477
Current Speed: 3412.92 kB/s
^CKeyboardInterrupt at line 68

Any further comments and improvements would be greatly appreciated.

Regards,
Zhao

On the other hand, I also came up with the following solutions:

# This is the method 1:
$ cat rev-1.py 
import subprocess
import time
import pycurl
from io import BytesIO
import sys
from threading import Thread

def fetch_proxies():
    command = 'echo "show stat" | sudo socat stdio /var/run/haproxy.sock 2>/dev/null | awk -F, \'$1=="socks5" && !($2~/^(FRONTEND|BACKEND)$/) {print $2,$74}\''
    process = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
    output, error = process.communicate()

    result_dict = {}

    if error is not None:
        return result_dict

    lines = output.decode('utf-8').split('\n')

    for line in lines:
        if line != "":
            key_value = line.split(' ')
            result_dict[key_value[0]] = key_value[1]

    return result_dict


class TestProxyThread(Thread):
    def __init__(self, curl_obj, download_buffer):
        super(TestProxyThread, self).__init__()
        self.curl_obj = curl_obj
        self.download_buffer = download_buffer
        self.start_time = time.time()
        self.end_time = None  

    def run(self):
        try:
            self.curl_obj.perform()
        except pycurl.error as e:  
            pass
        finally:
            self.end_time = time.time()  

        self.download_buffer.seek(0, 2)

    def get_total_bytes_downloaded(self):
        return self.download_buffer.tell()  

    def get_time_taken(self):
        if self.end_time:
            return self.end_time - self.start_time
        else:
            return time.time() - self.start_time  

def progress(download_t, download_d, upload_t, upload_d):
    global last_calc_time, download_start_time
    current_time = time.time()

    if current_time - last_calc_time >= 2:
        elapsed_time = current_time - download_start_time
        current_speed = download_d / elapsed_time / 1024  
        print(f"Current Speed: {current_speed:.2f} kB/s")
        last_calc_time = current_time

def test_proxy(proxy, url):
    global download_start_time, last_calc_time
    download_start_time = time.time()
    last_calc_time = download_start_time

    download_buffer = BytesIO()

    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(c.WRITEFUNCTION, download_buffer.write)
    c.setopt(pycurl.PROXY, proxy)
    c.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5_HOSTNAME)
    c.setopt(pycurl.NOPROGRESS, False)
    c.setopt(pycurl.XFERINFOFUNCTION, progress)
    c.setopt(pycurl.TIMEOUT, 5)

    thread = TestProxyThread(c, download_buffer)
    thread.start()

    while thread.is_alive():
        try:
            time.sleep(0.1)
        except KeyboardInterrupt:
            print("Script interrupted by user. Exiting.")
            if not thread.end_time:
                thread.end_time = time.time()  
            thread.join()
            sys.exit(0)

    total_bytes_downloaded = thread.get_total_bytes_downloaded()
    download_time = thread.get_time_taken()

    average_speed = total_bytes_downloaded / download_time  
    average_speed /= 1024  

    return average_speed

proxy_data = fetch_proxies()
url = "http://ipv4.download.thinkbroadband.com/1GB.zip"

try:
    for key, value in proxy_data.items():
        print(f"Proxy: {key}")
        average_speed = test_proxy(value, url)
        print(f"Average Speed: {average_speed:.2f} kB/s")
except KeyboardInterrupt:
    print("Script interrupted by user. Exiting.")
    sys.exit(0)

# This is the method 2:
$ cat rev-2.py 
import subprocess
import time
import pycurl
from io import BytesIO 
import sys

def fetch_proxies():
    command = 'echo "show stat" | sudo socat stdio /var/run/haproxy.sock 2>/dev/null | awk -F, \'$1=="socks5" && !($2~/^(FRONTEND|BACKEND)$/) {print $2,$74}\''
    process = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
    output, error = process.communicate()

    result_dict = {}  # Initialize default empty dictionary

    if error is not None:
        return result_dict  # Return the empty dictionary in case of error

    lines = output.decode('utf-8').split('\n')

    for line in lines:
        if line != "":
            key_value = line.split(' ')
            result_dict[key_value[0]] = key_value[1]

    return result_dict


def test_proxy(proxy, url):
    global last_calc_time, download_start_time

    buffer = BytesIO()

    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.WRITEDATA, buffer)
    c.setopt(pycurl.PROXY, proxy)
    c.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5_HOSTNAME)
    c.setopt(pycurl.NOPROGRESS, False)
    c.setopt(pycurl.XFERINFOFUNCTION, progress)
    c.setopt(pycurl.TIMEOUT, 5)

    download_start_time = time.time()
    last_calc_time = download_start_time

    try:
        c.perform()
    except pycurl.error as e:
        #print(f'An error occurred: {e}')
        # Print the exception's error code and description
        #print(f'Error code: {e.args[0]}')
        #print(f'Error message: {e.args[1]}')
        if e.args[0]==42:
            #print("Script interrupted by user. Exiting During perform pycurl.")
            sys.exit(0)  # exit the program immediately
        else:
            pass

    average_speed = c.getinfo(pycurl.SPEED_DOWNLOAD) / 1024
    return average_speed

def progress(download_t, download_d, upload_t, upload_d):
    global last_calc_time, download_start_time
    current_time = time.time()

    if current_time - last_calc_time >= 2:
        elapsed_time = current_time - download_start_time
        current_speed = download_d / elapsed_time / 1024
        print(f"Current Speed: {current_speed:.2f} kB/s")
        last_calc_time = current_time

proxy_data = fetch_proxies()
url = "http://ipv4.download.thinkbroadband.com/1GB.zip"

for key, value in proxy_data.items():
    print(f"Proxy: {key}")
    try:
        average_speed = test_proxy(value, url)
        print(f"Average Speed: {average_speed:.2f} kB/s")
    except KeyboardInterrupt:
        print("Script interrupted by user. Exiting.")
        break  # exit the for loop

But based on the test results, the Average Speed is too high, and I don’t know how to correct this issue:

(datasci) werner@X10DAi:~/Desktop$ python rev-1.py 
Proxy: SG_ssr_futeapbquf5m.nodelist.club_1453_018d83c677e05e71f172014fc3f45e39
Current Speed: 0.52 kB/s
Current Speed: 2370.52 kB/s
Average Speed: 4637.05 kB/s
Proxy: HK_ssr_wo8o8npg4fny.nodelist.club_1303_b5bf85111d0f51c517ec7302d3f33ce1
Current Speed: 1458.93 kB/s
Current Speed: 5754.89 kB/s
Average Speed: 6775.81 kB/s
Proxy: SG_ssr_futeapbquf5m.nodelist.club_1354_ddfba110eddfdb7037f389e9b5917477
Current Speed: 2652.49 kB/s
Current Speed: 8449.61 kB/s
Average Speed: 10009.58 kB/s
Proxy: HK_ssr_fwqvvo60u1mj.nodelist.club_1423_dba3b0996744e7b82eae7634626a5ba9
Current Speed: 765.14 kB/s
^CScript interrupted by user. Exiting.
Current Speed: 2911.64 kB/s


(datasci) werner@X10DAi:~/Desktop$ python rev-2.py 
Proxy: SG_ssr_futeapbquf5m.nodelist.club_1453_018d83c677e05e71f172014fc3f45e39
Current Speed: 163.68 kB/s
Current Speed: 6135.73 kB/s
Average Speed: 7998.58 kB/s
Proxy: HK_ssr_wo8o8npg4fny.nodelist.club_1303_b5bf85111d0f51c517ec7302d3f33ce1
Current Speed: 935.88 kB/s
Current Speed: 4845.68 kB/s
Average Speed: 6304.96 kB/s
Proxy: SG_ssr_futeapbquf5m.nodelist.club_1354_ddfba110eddfdb7037f389e9b5917477
^CTraceback (most recent call last):
  File "/home/werner/Desktop/rev-2.py", line 60, in progress
    def progress(download_t, download_d, upload_t, upload_d):
    
KeyboardInterrupt

The previously proposed blessed-based method also has this problem and I still don’t know how to fix it.

Regards,
Zhao

Software timing is always much trickier than people generally assume, so it’s good to look critically at the results and see if they make sense.

In this case, the average speed reported at the end doesn’t match up with the current speed in the callbacks. So, something is wrong in the code - either in your code or in pycurl or both.

To debug this, I would suggest:

  • verify that SPEED_DOWNLOAD really gives the correct value: add your own measurement
  • make the use of progress optional; rerun everything with progress on (as now) and off

To verify SPEED_DOWNLOAD (or verify how to interpret SPEED_DOWNLOAD), you could do your own measurement. You know the total time (wall-clock time) spent in curl (and if progress is off this should be pretty reliable) and you know the total downloaded data size, so you can report that and compare to the SPEED_DOWNLOAD info.
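
A minimal sketch of such a comparison, under a few assumptions: the names `kb_per_s` and `measure` are made up for illustration, the bytes are counted in a write callback instead of being buffered, and the pycurl info is read in a `finally` block so that a timeout raised by `perform()` does not skip it. As far as I know `getinfo` can still be called after a failed transfer as long as the handle has not been closed, but treat that as something to verify.

```python
import time


def kb_per_s(num_bytes, seconds):
    """Average rate in kB/s (1 kB = 1024 bytes, as in the script above)."""
    return num_bytes / seconds / 1024 if seconds > 0 else 0.0


def measure(url, proxy=None, timeout=5):
    """Download url and return (own_speed, pycurl_speed), both in kB/s."""
    import pycurl  # imported here so kb_per_s stays usable without pycurl

    received = 0

    def on_data(chunk):
        # Count the bytes ourselves and discard them; returning None tells
        # pycurl that the whole chunk was consumed.
        nonlocal received
        received += len(chunk)

    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.WRITEFUNCTION, on_data)
    c.setopt(pycurl.TIMEOUT, timeout)
    if proxy:
        c.setopt(pycurl.PROXY, proxy)
        c.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5_HOSTNAME)

    start = time.monotonic()
    try:
        c.perform()
    except pycurl.error:
        pass  # e.g. the timeout fires before a large file has finished
    finally:
        elapsed = time.monotonic() - start
        # Read the info before close(), whether or not perform() raised.
        pycurl_speed = c.getinfo(pycurl.SPEED_DOWNLOAD) / 1024
        c.close()

    return kb_per_s(received, elapsed), pycurl_speed
```

Calling `measure(url, proxy)` once per proxy and printing both numbers side by side should show whether the two measurements agree.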

The progress function should be optional since it is a real time sink itself (Python → pycurl C → Python callback → pycurl C → Python main thread; and it’s called pretty often). It’s unclear to me if and how pycurl discounts this in its internal time metrics. The code in the progress function seems ok to me (you do have to use download_d which is the total amount of data downloaded). I’m a bit wary about those global variables – using them implicitly assumes that pycurl itself is also always single-threaded! – but that seems to work out ok too.
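
One way to avoid the globals and make the callback optional is to keep the timing state on a small callable object. This is only a sketch: `ProgressReporter`, its `clock` and `out` parameters, and the `show_progress` flag in the usage comment are hypothetical names, not part of pycurl.

```python
import time


class ProgressReporter:
    """Callable for XFERINFOFUNCTION that keeps its timing state on the instance."""

    def __init__(self, interval=2.0, clock=time.monotonic, out=print):
        self.interval = interval
        self.clock = clock
        self.out = out
        self.start = clock()
        self.last_report = self.start
        self.downloaded = 0

    def __call__(self, download_t, download_d, upload_t, upload_d):
        # Record the running total on every call, not only when reporting.
        self.downloaded = download_d
        now = self.clock()
        if now - self.last_report >= self.interval:
            elapsed = now - self.start
            self.out(f"Current Speed: {download_d / elapsed / 1024:.2f} kB/s")
            self.last_report = now
        return 0  # returning 0 lets the transfer continue


# Usage sketch (assumes an existing pycurl.Curl handle named c):
# if show_progress:
#     reporter = ProgressReporter()
#     c.setopt(pycurl.NOPROGRESS, False)
#     c.setopt(pycurl.XFERINFOFUNCTION, reporter)
```

Creating a fresh reporter per transfer means each proxy test starts with its own start time and byte count, so nothing is shared between transfers.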

If you still get suspicious results after doing your own measurements, then one thing that might also be interfering is (server-side) caching. Don’t know if this is relevant in this case.

Btw, it turns out that someone did a comparison between pycurl and requests and put it on GitHub: svanoort/python-client-benchmarks (a microbenchmark of different Python HTTP clients). That code could also give you hints about how to dig deeper.

It seems that pycurl is preferable.

Now I am trying to compare the average speed I calculate manually with the average speed reported by pycurl, but the latter is always 0. The test code is shown below:

$ cat rev-3.py 
import subprocess
import time
import pycurl
from io import BytesIO
import sys
import signal

def sigint_handler(signal, frame):
    print(f"KeyboardInterrupt at line {frame.f_lineno}")
    sys.exit(0)

signal.signal(signal.SIGINT, sigint_handler)

def fetch_proxies():
    command = 'echo "show stat" | sudo socat stdio /var/run/haproxy.sock 2>/dev/null | awk -F, \'$1=="socks5" && !($2~/^(FRONTEND|BACKEND)$/) {print $2,$74}\''
    try:
        process = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
        output, error = process.communicate()
    except Exception as e:
        print("Error occurred while executing shell command: ", e)
        return {}

    result_dict = {}

    if error is not None:
        print("Error occurred while getting proxies status: ", error.decode('utf-8'))

    lines = output.decode('utf-8').split('\n')

    for line in lines:
        if line and len(line.split(' ')) == 2:
            key_value = line.split(' ')
            result_dict[key_value[0]] = key_value[1]

    return result_dict

def test_proxy(proxy, url):
    global last_calc_time, download_start_time, total_downloaded_data
    buffer = BytesIO()

    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.WRITEDATA, buffer)
    c.setopt(pycurl.PROXY, proxy)
    c.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5_HOSTNAME)
    c.setopt(pycurl.NOPROGRESS, False)
    c.setopt(pycurl.XFERINFOFUNCTION, progress)
    c.setopt(pycurl.TIMEOUT, 5)

    download_start_time = time.time()
    last_calc_time = download_start_time
    total_downloaded_data = 0
    py_speed = 0

    try:
        c.perform()
        py_speed = c.getinfo(pycurl.SPEED_DOWNLOAD) / 1024
    except pycurl.error as e:
        pass
    finally:
        c.close()

    elapsed_time = time.time() - download_start_time
    average_speed = total_downloaded_data / elapsed_time / 1024
    print(f"PycURL Speed: {py_speed:.2f} kB/s")

    return average_speed


def progress(download_t, download_d, upload_t, upload_d):
    global last_calc_time, download_start_time, total_downloaded_data
    current_time = time.time()

    if current_time - last_calc_time >= 2:
        elapsed_time = current_time - download_start_time  # Time since download start
        current_speed = download_d / elapsed_time / 1024   # Speed since download start
        print(f"Current Speed: {current_speed:.2f} kB/s")
        last_calc_time = current_time
        total_downloaded_data = download_d

    return 0  

proxy_data = fetch_proxies()

if not proxy_data:
    print("No proxy data found.")
    sys.exit(1)

url = "http://ipv4.download.thinkbroadband.com/1GB.zip"

for key, value in proxy_data.items():
    print(f"Testing Proxy: {key}")
    average_speed = test_proxy(value, url)
    print(f"Average Speed: {average_speed:.2f} kB/s")

The test results are as follows:

(datasci) werner@X10DAi:~/Desktop$ python rev-3.py 
Testing Proxy: SG_ssr_futeapbquf5m.nodelist.club_1453_018d83c677e05e71f172014fc3f45e39
Current Speed: 75.63 kB/s
Current Speed: 5395.45 kB/s
PycURL Speed: 0.00 kB/s
Average Speed: 4389.14 kB/s
Testing Proxy: HK_ssr_wo8o8npg4fny.nodelist.club_1303_b5bf85111d0f51c517ec7302d3f33ce1
Current Speed: 2138.56 kB/s
Current Speed: 7901.99 kB/s
PycURL Speed: 0.00 kB/s
Average Speed: 6323.33 kB/s
Testing Proxy: SG_ssr_futeapbquf5m.nodelist.club_1354_ddfba110eddfdb7037f389e9b5917477
Current Speed: 937.87 kB/s
Current Speed: 7540.93 kB/s
PycURL Speed: 0.00 kB/s
Average Speed: 6209.84 kB/s
Testing Proxy: HK_ssr_fwqvvo60u1mj.nodelist.club_1423_dba3b0996744e7b82eae7634626a5ba9
^CKeyboardInterrupt at line 70

I don’t know how to fix this problem. Any tips will be appreciated.

Regards,
Zhao