Solaris

UltraSparc T2 is out !!

It’s out!

We are also waiting for a new SunBlade with a T2. It should arrive at the office in 3 weeks. I’ll post pictures then :)

Using the special crypto chip of Sun’s T1 processor (on T1000 and T2000)

I know this is an old improvement by now, but since I’m using it once more, here is a reminder that it exists and how to set it up.

If you are using Apache, have a look at Sun’s PDF doc.

Before compiling Apache, install the optimized GCC 4 with CoolThreads support. You will find it by searching for "coolthreads" or "cooltools" on Sun’s website.

Just get the 2 packages and install them. You can put them wherever you want, but the easiest way is to keep the 2 packages in the same directory.

Then put the bin directory first in your PATH so that the new GCC is the one being used:

PATH=/opt/tools/gcc-coolthreads/gcc/bin:/opt/csw/bin:/opt/csw/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/xpg4/bin:/usr/ccs/bin:/usr/ucb:/usr/openwin/bin:/usr/sfw/bin:/usr/sfw/sbin:/usr/local/bin:/usr/local/sbin

In this example, my GCC is installed in "/opt/tools/gcc-coolthreads".

Untar your Apache (2.2.4 for now) and cd into it. Configure and build it with the command line below. This should build the prefork MPM, which is the most efficient on the T2000. If you are not sure it will be selected, force it (for instance with --with-mpm=prefork).

CFLAGS='-DSSL_EXPERIMENTAL -DSSL_ENGINE -xO4' ./configure --enable-ssl --enable-rule=SSL_EXPERIMENTAL --with-ssl=/usr/sfw --prefix=/opt/monitor/apache-2.2.4  --enable-mods-shared=all  --enable-so
make
make install

This will install Apache in "/opt/monitor/apache-2.2.4".

You then have to change the config files. Add the Include for httpd-ssl.conf at the end of the httpd.conf file.
In the extra/httpd-ssl.conf file, search for the "SSLCryptoDevice" lines and replace them with:

SSLCryptoDevice pkcs11
SSLRandomSeed startup file:/dev/urandom 512
SSLRandomSeed connect file:/dev/urandom 512

And there you go. Your Apache is now 100 times faster when doing SSL. Just don’t forget to tune the other parameters of the config files, such as "extra/httpd-mpm.conf", where you can tune the prefork module like this:



ServerLimit 8192
MaxClients 2048

ListenBacklog 8192
ServerLimit 2048
MaxClients 2048
MinSpareServers 20
MaxSpareServers 20
MaxRequestsPerChild 0

StartServers         20

Having DSS compiles on Solaris 10 [part 7] – DSS and Multicast streams

I finally got DSS working fine on Solaris, with the multicast source on a non-primary interface.

This modified source tree should work with most Solaris 10 releases, and maybe 8 and 9, on SPARC or Intel. Just untar it, run "./buildtarball myversion" and use the resulting tar.gz file for the install. You may also need the Install script I modified to install in /opt/deflector/dss. I hope to find time to make the install directory a command-line option so anybody can use it.

You will see a new parameter in streamingserver.xml where you define the IP on which you wish to subscribe to multicast streams: <PREF NAME="ip_listen_for_multicast" >0</PREF>
This parameter is set to 0 by default, which means the default behaviour: INADDR_ANY is used as the subscription IP. You can set it to an IP of your host to subscribe on a specific interface, like:
<PREF NAME="ip_listen_for_multicast" >10.16.240.19</PREF>
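To make the behaviour of this parameter concrete, here is a minimal sketch of how a pref value could be turned into the interface address used for the subscription. This is not the code from my patch: MulticastInterfaceFromPref and BuildMembership are hypothetical helper names, and the only assumption taken from the text is that "0" means INADDR_ANY and anything else is a dotted IPv4 address.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <cassert>
#include <cstring>

// Hypothetical helper (not part of DSS): convert the pref value into the
// address to store in ip_mreq.imr_interface, in network byte order.
// Returns false if the value is neither "0" nor a valid dotted IPv4 address.
bool MulticastInterfaceFromPref(const char* pref, in_addr* outAddr)
{
    assert(pref != 0 && outAddr != 0);
    if (std::strcmp(pref, "0") == 0) {
        outAddr->s_addr = htonl(INADDR_ANY);   // default: let the OS pick the interface
        return true;
    }
    return inet_pton(AF_INET, pref, outAddr) == 1;
}

// Hypothetical helper: fill an ip_mreq for a group given as a dotted string
// and the pref value above. The result is what IP_ADD_MEMBERSHIP expects.
bool BuildMembership(const char* group, const char* pref, ip_mreq* outReq)
{
    assert(group != 0 && outReq != 0);
    std::memset(outReq, 0, sizeof(*outReq));
    if (inet_pton(AF_INET, group, &outReq->imr_multiaddr) != 1)
        return false;
    return MulticastInterfaceFromPref(pref, &outReq->imr_interface);
}
```

With the values from this post, BuildMembership("224.1.1.9", "10.16.240.19", &req) would give a request that subscribes to 224.1.1.9 on the interface owning 10.16.240.19, while a pref of "0" leaves the choice of interface to the OS.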

It seems another solution would be to change the multicast LAN route. Let’s say your primary interface has IP 10.16.248.19 and the multicast comes in on another interface whose IP is 10.16.240.19. You would do it like this:

route -n delete -interface 224.0/4 -gateway 10.16.248.19
route -n add -interface 224.0/4 -gateway 10.16.240.19

My patch is not an end in itself, as this IP should at least be settable per multicast stream. It’s just good enough for my needs, given my lack of time and my limited C++ experience.

Just contact me if you want more details about the config files or the start/stop script.

By the way, the .tar.gz file with patched source is here and a patch file (diff file to apply using gpatch on Solaris) for the original source code is here

Having DSS compiles on Solaris 10 [part 6] – DSS and Multicast streams

And here we go again with DSS !
This time it’s not a real compilation problem and is not only Solaris related.

The best thing you can do with DSS, when you own the source part of the stream, is to use multicast. This way, you stream once from your encoder, and many DSS servers can access the stream and reflect it as unicast to your customers.

That sounds good and simple. In fact, it is not that simple if your DSS server is "multi-homed".

Multi-homed means your server uses multiple IPs on different networks, VLANs and/or NICs.

Our servers use MANY networks. The first NIC (nic 0) is for admin tasks. NICs 1 and 2 are aggregated to get 2 Gb of bandwidth for data. On this aggregated interface, we have many IPs set on different networks and VLANs.

To cut the story short, the multicast stream comes in on interface aggr543510 (VLAN 543 on aggregated interface 510). Clients connect to the admin NIC for tests, and to aggr536510 (VLAN 536).

By default, when reflecting a multicast stream, DSS binds to any IP on the port supplied in the SDP file and subscribes to the multicast group. This is pretty usual, and any other application would do the same.
The trouble comes when you are using multiple NICs. The bind is effectively done on * (any IP), but the multicast subscription is done on ONLY ONE INTERFACE!

Let’s look at netstat -g (this works on Solaris; on OS X, use netstat -a -i and try to find the information on your own):

# netstat -g
Group Memberships: IPv4
Interface Group                RefCnt
--------- -------------------- ------
lo0       ALL-SYSTEMS.MCAST.NET      1
e1000g0   224.1.1.9                 1
e1000g0   ALL-SYSTEMS.MCAST.NET      1
e1000g34001:1 ALL-SYSTEMS.MCAST.NET      1
e1000g34001:2 ALL-SYSTEMS.MCAST.NET      1
e1000g36001:1 ALL-SYSTEMS.MCAST.NET      1
e1000g36001:2 ALL-SYSTEMS.MCAST.NET      1
e1000g244001 ALL-SYSTEMS.MCAST.NET      1
e1000g543001:1 ALL-SYSTEMS.MCAST.NET      1

The important line is the one with "e1000g0 224.1.1.9". There you can see that multicast group 224.1.1.9 is subscribed on interface e1000g0.
But e1000g0 is our admin interface!!!

So DSS will never see any data to reflect, and your client will stay paused.

I called Sun Support yesterday to have them comment on this. As I said, this is not really DSS related. It comes from the way the OS (every UNIX OS?) deals with multicast subscriptions. No news yet. I’m pretty sure I’ll be told that this is normal behaviour.

Once that’s said, what can I do to make DSS work on my architecture?

The key lies in CommonUtilitiesLib/UDPSocket.cpp
YES, we are back again to C++ programming !

The original source code says :

OS_Error UDPSocket::JoinMulticast(UInt32 inRemoteAddr)
{
    struct ip_mreq  theMulti;
        UInt32 localAddr = fLocalAddr.sin_addr.s_addr; // Already in network byte order

#if __solaris__
    if( localAddr == htonl(INADDR_ANY) )
         localAddr = htonl(SocketUtils::GetIPAddr(0));
#endif

    theMulti.imr_multiaddr.s_addr = htonl(inRemoteAddr);
    theMulti.imr_interface.s_addr = localAddr;

    int err = setsockopt(fFileDesc, IPPROTO_IP, IP_ADD_MEMBERSHIP, (char*)&theMulti, sizeof(theMulti));
    //AssertV(err == 0, OSThread::GetErrno());
    if (err == -1)
         return (OS_Error)OSThread::GetErrno();
    else
         return OS_NoErr;
}

There you can see 2 things:

  • There is a special part for Solaris
  • The interface (the IP) to bind to and to subscribe on is set here

As I already said, I’m not a C++ expert. I just changed the code so that the IP to subscribe on, defined in "localAddr", is set to the IP I need:

    OS_Error UDPSocket::JoinMulticast(UInt32 inRemoteAddr)
    {
        struct ip_mreq  theMulti;
        UInt32 localAddr = fLocalAddr.sin_addr.s_addr; // Already in network byte order

    /*
    #if __solaris__
        if( localAddr == htonl(INADDR_ANY) )
             localAddr = htonl(SocketUtils::GetIPAddr(0));
    #endif
    */

        // Set by Prune - 20070712: force the IP of the interface to subscribe on
        localAddr  = htonl(0x0a10f07b);

        theMulti.imr_multiaddr.s_addr = htonl(inRemoteAddr);
        theMulti.imr_interface.s_addr = localAddr;

        // Set by Prune - 20070712: debug output
        qtss_printf("Multicast local %x\n", localAddr);
        qtss_printf("Multicast remote %x\n", htonl(inRemoteAddr));

        int err = setsockopt(fFileDesc, IPPROTO_IP, IP_ADD_MEMBERSHIP, (char*)&theMulti, sizeof(theMulti));
        //AssertV(err == 0, OSThread::GetErrno());
        if (err == -1)
             return (OS_Error)OSThread::GetErrno();
        else
             return OS_NoErr;
    }
    

    So the local IP is set with "localAddr = htonl(0x0a10f07b);", where 0a10f07b is the hexadecimal value of IP 10.16.240.123.
    I also added some debug output that you can see if you start DSS with the -d option.
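If you need the hexadecimal value for your own IP, you do not have to compute it by hand. The sketch below is only an illustration and not part of the DSS source: HostOrderIPv4 and PrintAsHex are hypothetical names, and the code simply relies on the standard inet_pton() and ntohl() calls.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <cstdio>

// Hypothetical helper: returns the IPv4 address as a host-order 32-bit value
// (the form that is passed to htonl() in the hack above), or 0 if the string
// is not a valid dotted address.
unsigned int HostOrderIPv4(const char* dotted)
{
    in_addr addr;
    if (inet_pton(AF_INET, dotted, &addr) != 1)
        return 0;
    return ntohl(addr.s_addr);
}

// Hypothetical helper: prints the address next to its 8-digit hexadecimal form.
void PrintAsHex(const char* dotted)
{
    std::printf("%s = 0x%08x\n", dotted, HostOrderIPv4(dotted));
}
```

For 10.16.240.123 the four bytes are 0x0a, 0x10, 0xf0 and 0x7b, which is where 0x0a10f07b comes from.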

    Then everything is OK and you have the right interface subscribed :

    # netstat -g
    Group Memberships: IPv4
    Interface Group                RefCnt
    --------- -------------------- ------
    lo0       ALL-SYSTEMS.MCAST.NET      1
    e1000g0   ALL-SYSTEMS.MCAST.NET      1
    e1000g34001:1 ALL-SYSTEMS.MCAST.NET      1
    e1000g34001:2 ALL-SYSTEMS.MCAST.NET      1
    e1000g36001:1 ALL-SYSTEMS.MCAST.NET      1
    e1000g36001:2 ALL-SYSTEMS.MCAST.NET      1
    e1000g244001 ALL-SYSTEMS.MCAST.NET      1
    e1000g543001:1 224.1.1.9                 1
    e1000g543001:1 ALL-SYSTEMS.MCAST.NET      1

    OK, I have to do this better, like getting the interface to bind to from the config file. This was just a late-night test :)

    But what is the right way to do it?
    The server (the DSS process) can’t know which interface to subscribe on. Does it have to subscribe on all interfaces? This does not seem possible other than by opening many sockets, and the server would then have to decide which one to use. Tricky, isn’t it?
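For what it is worth, the sockets API itself appears to allow several memberships on one socket, as long as each join names a different (group, interface) pair. The following is an untested sketch of that idea, not something I have run inside DSS: MulticastCapableAddrs and JoinOnAll are hypothetical names, and the code assumes a platform that provides getifaddrs(), which may not be the case on older Solaris releases.

```cpp
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <net/if.h>
#include <netinet/in.h>
#include <cstddef>
#include <vector>

// Hypothetical helper: collect the IPv4 address of every interface that is
// up and flagged as multicast capable.
std::vector<in_addr> MulticastCapableAddrs()
{
    std::vector<in_addr> result;
    ifaddrs* list = 0;
    if (getifaddrs(&list) != 0)
        return result;                      // enumeration failed: empty list
    for (ifaddrs* it = list; it != 0; it = it->ifa_next) {
        if (it->ifa_addr == 0 || it->ifa_addr->sa_family != AF_INET)
            continue;                       // skip non-IPv4 entries
        if ((it->ifa_flags & IFF_UP) == 0 || (it->ifa_flags & IFF_MULTICAST) == 0)
            continue;                       // skip down or non-multicast interfaces
        result.push_back(reinterpret_cast<sockaddr_in*>(it->ifa_addr)->sin_addr);
    }
    freeifaddrs(list);
    return result;
}

// Hypothetical helper: join 'group' (network byte order) once per address on
// the same socket. Returns the number of joins that succeeded; a join can
// fail, for instance when two addresses belong to the same interface.
int JoinOnAll(int fd, in_addr group, const std::vector<in_addr>& addrs)
{
    int joined = 0;
    for (std::size_t i = 0; i < addrs.size(); ++i) {
        ip_mreq req;
        req.imr_multiaddr = group;
        req.imr_interface = addrs[i];
        if (setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, (char*)&req, sizeof(req)) == 0)
            ++joined;
    }
    return joined;
}
```

Joining everywhere is a blunt approach: the server would receive the stream from whichever network carries it, but it would also subscribe on interfaces where the group has no business being, so a per-stream setting still looks like the cleaner design.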

    That is where I am right now. No really good solution, just a hack to get it working.
    And what will happen if one day I want another multicast stream coming from another network? …
    Keep reading here, as I’ll post any information I get if Sun or the DSS mailing list answers me.

    Sun Streaming Server

    While the software part of the Sun Streaming Server will be released soon in OpenSolaris, Sun is selling a fully packaged IPTV solution.
    It is based on Sun 4100 and 4500 servers, and they also provide a big switch. You can check it out here: http://www.sun.com/servers/networking/streamingsystem/
    Check the PDFs on the right to understand how they claim to support up to 160,000 streams of 2 Mbps :)

    Having DSS compiles on Solaris 10 [part 5]

    This should be the last part :)

    In fact, it is not the call to OS::GetNumProcessors() in Server.tproj/RunServer.cpp that is causing the problem. That call is made only once.

    It’s the one in Server.tproj/QTSServerInterface.cpp.
    This part of the code runs every second for the server statistics, which leads to trouble with the slow UNIX command ‘uname -X’ and the parsing of its output.

    The best solution would be to cache the number of CPUs somewhere, so the server does not have to fetch it each time it has to divide the CPU usage!
    :)

    One solution, as I said earlier, is to change this function to use something better than a UNIX command.
    Another solution is to have the number of CPUs cached.
    The last solution is to remove the call to this function in Server.tproj/QTSServerInterface.cpp.
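The first two solutions can be combined. Here is a minimal sketch, assuming sysconf(_SC_NPROCESSORS_ONLN) is available (it is on Solaris and Linux): the OS is asked once and the answer is reused. QueryNumProcessors and CachedNumProcessors are hypothetical names, not functions from the DSS source.

```cpp
#include <unistd.h>

// Hypothetical helper: ask the OS for the number of online CPUs.
// sysconf() returns -1 on error, so fall back to 1 in that case.
static unsigned int QueryNumProcessors()
{
    long n = ::sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? static_cast<unsigned int>(n) : 1u;
}

// Hypothetical helper: the function-local static is initialised on the first
// call only, so later calls (for example once per second from the statistics
// code) just return the stored value.
unsigned int CachedNumProcessors()
{
    static const unsigned int sNumCPUs = QueryNumProcessors();
    return sNumCPUs;
}
```

The trade-off is that a CPU taken online or offline while the server is running would go unnoticed until the next restart, which seems acceptable for a value only used to scale the CPU load.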

    If you do that, you may end up with a CPU load over 100%. But your server will not crash at 200 streams :)

    I filed a bug with Apple and I hope it will be taken into account:
    5316778
    DSS crash on Solaris 10 after 200 streams

    Having DSS compiles on Solaris 10 [part 4]

    Will this stop one day?
    I just came up with a new patch which keeps the dynamic aspect of the "uname -X" trick but uses a simple library call to check the number of CPUs. Low cost and efficiency guaranteed.
    This code was submitted by Fabrice Aneche, who is a better coder than I am and who took the time to dig this out while I was drinking wine and eating French cheese :)

    This patch can be used like :

    patch -p 1 < patch-dss.diff
    
    --- OS.cpp      Fri Jul  6 17:33:39 2007
    +++ OS.cpp.prune        Fri Jul  6 17:32:07 2007
    @@ -69,10 +69,14 @@
    #include <sys/sysctl.h>
    #endif
    
    -#if (__solaris__ || __linux__ || __linuxppc__)
    +#if (__linux__ || __linuxppc__)
    #include "StringParser.h"
    #endif
    
    +#if (__solaris__)
    +       #include <unistd.h>
    +#endif
    +
    #if __sgi__
    #include <sys/systeminfo.h>
    #endif
    @@ -410,37 +414,8 @@
    #if(__solaris__)
    {
    UInt32 numCPUs = 0;
    -    char linebuff[512] = "";
    -    StrPtrLen line(linebuff, sizeof(linebuff));
    -    StrPtrLen word;
    -
    -    FILE *p = ::popen("uname -X","r");
    -    while((::fgets(linebuff, sizeof(linebuff -1), p)) > 0)
    -    {
    -        StringParser lineParser(&line);
    -        lineParser.ConsumeWhitespace(); //skip over leading whitespace
    -
    -        if (lineParser.GetDataRemaining() == 0) // must be an empty line
    -            continue;
    -
    -        lineParser.ConsumeUntilWhitespace(&word);
    -
    -        if ( word.Equal("NumCPU")) // found a tag as first word in line
    -        {
    -            lineParser.GetThru(NULL,'=');
    -            lineParser.ConsumeWhitespace();  //skip over leading whitespace
    -            lineParser.ConsumeUntilWhitespace(&word); //read the number of cpus
    -            if (word.Len > 0)
    -                ::sscanf(word.Ptr, "%lu", &numCPUs);
    -
    -            break;
    -        }
    -    }
    -    if (numCPUs == 0)
    -        numCPUs = 1;
    -
    -    ::pclose(p);
    
    +    numCPUs = sysconf(_SC_NPROCESSORS_ONLN);
    return numCPUs;
    }
    #endif
    

    Having DSS compiles on Solaris 10 [part 3]

    I didn’t think I would have to come back to this installation. But here we are with Part 3!!
    If you tried to compile on Solaris and stream to more than 200 users, you experienced a crash. I thought the trick from part 2 had solved this, as my production server was running fine with more than 1000 users (test only :)

    While I was documenting the install process, I was also deploying new servers, with a fresh compile and install. And I found that the newly compiled server was crashing at 200+ streams!!!!

    DAMN !

    Going back to my "working" compile and diffing the original source against the patched one, I found that I had changed something else.
    There is a small (ugly?) function somewhere which tries to determine how many CPUs the server has. On Solaris this function seems to start a "uname -X" for each incoming user!

    My patch was to comment all this out and have the function return a static number (I don’t know why I set it to 24 :) )

    For those of you needing it, here is the patch (in diff -u format) :

    # diff -u DarwinStreamingSrvr5.5.5-Source-test/CommonUtilitiesLib/OS.cpp DarwinStreamingSrvr5.5.5-Source/CommonUtilitiesLib/OS.cpp
    --- DarwinStreamingSrvr5.5.5-Source-test/CommonUtilitiesLib/OS.cpp Wed May 18 10:01:14 2005
    +++ DarwinStreamingSrvr5.5.5-Source/CommonUtilitiesLib/OS.cpp Wed Jul 4 16:21:42 2007
    @@ -410,6 +410,8 @@
    #if(__solaris__)
    {
    UInt32 numCPUs = 0;
    +/* comment by Prune
    +
    char linebuff[512] = "";
    StrPtrLen line(linebuff, sizeof(linebuff));
    StrPtrLen word;
    @@ -440,7 +442,8 @@
    numCPUs = 1;
     
    ::pclose(p);
    -
    + */
    + numCPUs = 24;
    return numCPUs;
    }
    #endif

    Set the number of CPUs to whatever you wish; 2 or 8 may be a good number. This number can be set in the config file, using the "5" parameter.

    I think that’s it, you have everything you need now :)
    Let’s start streaming!

    Link aggregation versus IP Multipathing

    Sun has implemented not one, but two ways to get redundancy and high network throughput. One is trunking (link aggregation), done with dladm; the other is IP Multipathing. They don’t work at the same level, and Nicolas Droux explains the main differences here.

    The right solution, for rich people, would be to use both systems: have 2 trunks to 2 different switches, then use multipathing for redundancy.

    Check the other articles of this blog, they contain a lot of information.

    Dtrace on Solaris

    DTrace is the new Solaris / OpenSolaris tool replacing top, prstat, vmstat, …
    Following the article on Akhen’s blog, here are more URLs with a lot of scripts and examples:

    http://www.brendangregg.com/dtrace.html#Examples
    http://www.solarisinternals.com/si/dtrace/index.php
    http://www.context-switch.com/performance/dtrace.htm

    http://www.sun.com/bigadmin/content/dtrace/

    And the DTraceToolkit itself