Performance of python scripts

12 thoughts on “Performance of python scripts”

  1. Hello,

    I need to connect a GL865 to an existing device with a given communication protocol. I wrote a simple Python script to do it, but I ran into a problem with the script's performance. My communication always timed out, because checking the received packet takes too long. I wrote a simple example that calculates a simple XOR over all characters of a simulated (received) string, and I used a logic analyser and a GPIO pin to measure the time needed for the calculation. The times with AT#CPUMODE=0 were around 6 seconds, and with AT#CPUMODE=1 around 4 seconds. Acceptable times for me are below 1 second. Is there any way to speed up packet processing?

    Thanks.

     

    Example:

     

     

    import MDM
    import MOD
    import GPIO

    print 'Test started.'

    GPIO.setIOdir(1,0,1)               # set GPIO pin as output
    MDM.send( 'at#cpumode=1\r', 0 )    # set fast clocks
    MOD.sleep( 30 )                    # wait 3 s before test (units are 100 ms)
     
    packet = '\x00'*1000               # create simulated packet

    for j in range( 10 ):
        GPIO.setIOvalue(1,1)           # set GPIO pin high
        lrc = 0x00
        for i in packet:               # calc XOR from whole packet
            lrc = lrc ^ ord( i )
        GPIO.setIOvalue(1,0)           # set GPIO pin low
        MOD.sleep( 10 )                # dummy 1 s delay before next run
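    One common way to speed up a tight loop like the one above is to push the per-character work into built-ins that iterate in C rather than in interpreted bytecode. This is only a sketch: on the module's embedded Python, `reduce` is a built-in, while on modern desktop Python it lives in `functools`.

```python
from functools import reduce   # not needed on old Python, where reduce is built in
from operator import xor

packet = '\x00' * 1000         # same simulated packet as above

# map(ord, ...) and reduce(xor, ...) run their loops in C, cutting the
# per-character interpreter overhead of the explicit for-loop version
lrc = reduce(xor, map(ord, packet), 0)
```

    Whether this helps enough on the module itself would have to be measured with the same GPIO technique.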

    1. If your scenario needs such calculation speed, maybe you need to redesign the project with fewer processing tasks or an external processor. Maybe you can describe the project a little more, so we can help with other (more useful) ideas?

       

    2. No, you have already carried out the main performance improvement with the AT#CPUMODE command. You can also read the chapter “Limits” in the Easy Script user guide. I agree with Cosmin’s comments.

  2. Are there any benchmarks available somewhere? We are using a GL865, which we intended to use as a bridge between our microcontroller and the web, letting it handle pushing data to a cloud service. However, we are seeing extremely poor performance when reading the data sent by the MCU. Iterating over a string of 100 bytes can take 2-4 seconds with very little logic in between.

     

    Edit: After optimizing the data handler by pre-allocating an array of the specified size instead of appending to it, I have managed to push it up to 100 bytes/s. Still far too slow.
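    For reference, the pre-allocation pattern described above looks roughly like this; the names and sizes are illustrative, not taken from the actual handler:

```python
SIZE = 100

# allocate the buffer once up front instead of growing it byte by byte;
# repeated append() calls can trigger repeated reallocation on a small heap
buf = [0] * SIZE
for i in range(SIZE):
    buf[i] = i & 0xFF      # index assignment, no buf.append(...)
```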

    1. Remember the Python engine has the lowest task priority. Along with the low speed you observe, you have another drawback: this speed cannot be guaranteed and can vary very much.

      All said above applies!

       

      1. Remember the Python engine has the lowest task priority. Along with the low speed you observe, you have another drawback: this speed cannot be guaranteed and can vary very much.

        All said above applies!

         

        We can live with some choppiness. I was just expecting 10-100 times this throughput. The way it is now severely limits what it can be used for.

         

        That being said, is there anything we can do to boost this besides setting CPUMODE to 1?

        – Can we disable anything in the modem to get more CPU time?

        – Is there anything we should stay away from? Loops, jumps, nested if statements, dot access etc?

        – Can we execute C code from Python somehow?

        1. 10-100 times is a lot relative to the performance of the module you are measuring.

          – You could try to see if something changes with AT#CPUMODE=3.

          – Regarding the critical code, you could send your source code to Cosmin, who could also contact Telit R&D. But I think you cannot get a throughput improvement of a factor of 10.

           

          Please explain your project in detail and what your exact throughput expectations are.
           
          Consider also that the UL865 product will be available in the near future. It has better Python performance, but you will need to test it in your application before ordering large quantities.

  3. 10-100 times is a lot relative to the performance of the module you are measuring.

    – You could try to see if something changes with AT#CPUMODE=3.

    – Regarding the critical code, you could send your source code to Cosmin, who could also contact Telit R&D. But I think you cannot get a throughput improvement of a factor of 10.

     

    Please explain your project in detail and what your exact throughput expectations are.
     
    Consider also that the UL865 product will be available in the near future. It has better Python performance, but you will need to test it in your application before ordering large quantities.

    10-100 was just a gut feeling. I expected the serial port throughput to be the bottleneck, not iterating over a string 🙂 I think I tried CPUMODE=3, but IIRC it gave me an error without an error code. I can double-check that when I get back to the office tomorrow. I’ll throw the troubling code to Cosmin so he can have a look.

     

    Our project consists of an AVR32 platform which collects data from attached sensors. This data is serialized to protobuf and the intention is to upload the data to a server. The idea is to let the AVR32 MCU buffer the data to a flash chip until a certain amount (say 1k) has been reached and a GSM connection has been established. The data will then be passed to the Python scripts on the modem and they can take care of uploading the data in the background. The data sent from the MCU to the GL865 modem looks as follows:

    // Size of each field is 1 byte, except for the payload
    [start][type][id][size][payload…][end]

    This is an HDLC-like data frame where two known delimiters are used for start (0x7E) and stop (0x7D). The data in between is escaped with an extra byte if the payload contains one of these control bytes. Reading a transmission (around 120 bytes in my test) from SER is instantaneous; iterating through it and "decoding" the data takes 1-2 seconds.
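    A decoder for the frame body might look roughly like the sketch below. The escape byte value (0x7C here) and the unescaping rule (the byte after the escape marker is taken literally) are assumptions, since the post does not spell them out:

```python
START = 0x7E    # frame delimiters from the post
END   = 0x7D
ESC   = 0x7C    # hypothetical escape marker, not specified in the post

def unescape(body):
    # body: list of byte values between the START and END delimiters;
    # an ESC byte means the following byte is a literal control byte
    out = []
    i = 0
    while i < len(body):
        b = body[i]
        if b == ESC:
            i = i + 1
            b = body[i]
        out.append(b)
        i = i + 1
    return out
```

    Keeping the body as a list of integers avoids a per-byte ord() call in the inner loop, which is one of the costs being measured here.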

     

    I want to be able to process 1kByte of data in under 1 second. 

    1. I want to be able to process 1kByte of data in under 1 second. 

      It’s not possible to get this performance with Python running on this platform.

      Please post the firmware version (at+cgmr).

      "at#cpumode=3" is supported on recent firmware versions. You could see an improvement, but you cannot obtain the performance written above. As written in the specs and in the posts, the Python task has the lowest priority.

      1. CPUMODE=3 is not supported on our units. They don’t have the latest firmware version, but I can’t check the exact one now (no unit hooked up).

         

        I understand that the Python task runs at the lowest priority, and I was not expecting to get any sort of high performance out of it. I was, however, expecting it to run a simple, non-math, non-allocating loop at a reasonable speed. The documentation should include a simple loop example so users can get a rough idea of just how slow this really is.
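        A minimal loop benchmark of the kind suggested above could look like this. This sketch uses the standard `time` module, which works on desktop Python; on the module itself, the GPIO-plus-logic-analyser technique from the first post is the more reliable way to measure.

```python
import time

def bench_loop(n):
    # time a bare XOR loop over n simulated bytes, as in the earlier example
    data = '\x00' * n
    acc = 0
    t0 = time.time()
    for c in data:
        acc = acc ^ ord(c)
    return time.time() - t0
```

        Dividing n by the returned time gives a rough bytes-per-second figure for pure iteration, before any real protocol logic is added.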

         

        Anyway, we are currently experimenting with a UL865 device to see if it runs any faster. If it doesn’t, we’ll need to ditch Python altogether.

  4. I just got the UL865 up and running after a minor porting session, and the difference is like night and day. Where the GL865 struggled to decode data at 150 bytes/s, the UL865 happily chews through the data at 4500 bytes/s. This is more than enough for our use 🙂