Friday 25 March 2011

Staring out the Windoze

My usual day-to-day development environment is a Mac, and I usually write code using Qt / g++ so it's easy to port it to Linux, which is the main development system at the University. However, most of the students seem to use Windows as their main computer system, so I really had to support Windows with my NGL library.

Luckily the Windows version of the QtSDK comes with a g++ compatible compiler called mingw, and once the other libraries were installed in the correct places the library compiled and linked into a DLL.

Problems with GLEW


The first problem I encountered was with the GLEW library. The versions available for download are compiled with the Visual Studio compiler and are not compatible with the mingw compiler / linker. To make this easier to use and integrate into the ngl library I added the source of the latest GLEW tree to the ngl project, and this is included into the Windows branch of ngl automatically in the Qt project file. This is done using the following code:


win32: {
        LIBS += -lmingw32
        LIBS += -lstdc++
        DEFINES += WIN32
        DEFINES += USING_GLEW
        DEFINES +=GLEW_STATIC
        INCLUDEPATH += C:/boost_1_44_0
        INCLUDEPATH += C:/boost
        LIBS += -L/lib/w32api
        DESTDIR = c:/
        SOURCES+=$$BASE_DIR/Support/glew/glew.c
        INCLUDEPATH+=$$BASE_DIR/Support/glew/
        DEFINES+=_WIN32
        CONFIG+=dll
        DEFINES+=BUILDING_DLL

}

Once this code is in place, the Windows version of Qt will set the win32 specific flags and compile correctly. It also includes the GLEW elements statically in my library, hence removing a DLL dependency.

Cross Platform Development
Some other conditional code was also required to make things compile correctly on all platforms. NGL uses 3 flags to select platform dependent includes and settings. These are set in the .pro file as part of the qmake process, as the following shows:

linux-g++:{
          DEFINES += LINUX
          LIBS+=-lGLEW
}

linux-g++-64:{
          DEFINES += LINUX
          LIBS+=-lGLEW
}
macx:DEFINES += DARWIN


So now the compiler will be passed one of -DDARWIN, -DLINUX or -DWIN32 on the command line, dependent upon the platform being used. This is then used to switch in the main header files for the libs. To place this all in one include file, <ngl/Types.h> is used, and it has the following conditional compilation code

#if defined (LINUX) || defined (WIN32)
  // header paths are case sensitive on Linux, so use the real directory names
  #include <GL/glew.h>
  #include <GL/gl.h>
  #include <GL/glu.h>
#endif
#ifdef DARWIN
  #include <unistd.h>
  #include <OpenGL/gl.h>
  #include <OpenGL/glu.h>
#endif
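
As a quick sanity check that the right flag is reaching the compiler, something like the following could be dropped into a test program. This is only a sketch: platformName is a hypothetical helper and is not part of NGL; it simply reports which of the three defines from the .pro file is active.

```cpp
#include <string>

// Hypothetical helper (not part of NGL): reports which of the platform
// defines passed by qmake is visible to the compiler.
std::string platformName()
{
#if defined(DARWIN)
  return "DARWIN";
#elif defined(LINUX)
  return "LINUX";
#elif defined(WIN32)
  return "WIN32";
#else
  // none of the flags was passed, e.g. when building outside the .pro file
  return "UNKNOWN";
#endif
}
```

If this returns UNKNOWN inside a project that includes <ngl/Types.h>, none of the GL headers above will have been pulled in either.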

With all this in place the library built with no problems, a .dll / .a file was generated under Windows, and a simple demo application would run. I left the library under Windows at this stage and continued the development under mac; however, I began to get reports of things not working under Windows. Initially I thought it was a driver / GPU problem as it seemed to occur in the shader manager class.

When things go wrong

The first indication of things not working was when a lookup of a shader returned the null program, which is a safety mechanism. The following code shows the method and the call to the method.

ShaderProgram * ShaderManager::operator[](
                      const std::string &_programName
                     )
{
  std::map <std::string, ShaderProgram * >::const_iterator program=m_shaderPrograms.find(_programName);
  // make sure we have a valid program
  if(program!=m_shaderPrograms.end() )
  {
    return program->second;
  }
  else
  {
    // unknown name, so warn and fall back to the null program
    std::cerr<<"Warning Program not known in [] "<<_programName.c_str();
    std::cerr<<" returning a null program and hoping for the best\n";
    return m_nullProgram;
  }
}

// in application
ngl::ShaderManager *shader=ngl::ShaderManager::instance();
(*shader)["Blinn"]->use();

It appeared at first that the map find method was at fault, as it was not finding a shader that I knew was in the class. However, when I started to print out the size and contents of the map I noticed that in some cases the map contained the correct values and in other cases it was empty. Why would this be?

This shouldn't happen!

It seemed weird that this was happening, as the class was based on a singleton pattern so there should only be one instance of the class. Also, this bug only appeared in the Windows version, so I was unsure what was going on.

I placed a number of debug statements in the singleton template class used for the shader manager and discovered that the constructor was being called twice. WTF!
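
The kind of instrumentation I mean looks roughly like the sketch below. It is not the actual NGL singleton: the Singleton template here is a simplified stand-in, and DemoManager and ctorCalls are hypothetical names used only for illustration.

```cpp
#include <iostream>

// Simplified stand-in for a singleton template, instrumented so that
// every constructor call is counted and reported.
template <typename T>
class Singleton
{
public:
  static T *instance()
  {
    if (s_instance == nullptr)
    {
      s_instance = new T();
    }
    return s_instance;
  }
  // number of times the constructor has run in this module
  static int ctorCalls() { return s_ctorCalls; }

protected:
  Singleton()
  {
    ++s_ctorCalls;
    std::cerr << "Singleton ctor call " << s_ctorCalls
              << " this=" << this << '\n';
  }

private:
  static T *s_instance;
  static int s_ctorCalls;
};

template <typename T> T *Singleton<T>::s_instance = nullptr;
template <typename T> int Singleton<T>::s_ctorCalls = 0;

// Hypothetical manager class using the template.
class DemoManager : public Singleton<DemoManager>
{
  friend class Singleton<DemoManager>;
  DemoManager() {}
};
```

Within a single executable the counter never goes past one, however many times DemoManager::instance() is called. Seeing two ctor messages with two different this addresses is the sign that two copies of the static data exist.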

My initial response was that I was having a weird threading issue and my singleton wasn't thread safe, so I added a QMutex / QMutexLocker construct around the singleton and also made it inherit from boost::noncopyable so that it could not be copied.
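
For illustration, the locking construct is sketched below using std::mutex and std::lock_guard from the standard library in place of QMutex / QMutexLocker, and deleted copy operations in place of boost::noncopyable, so that the example is self-contained. LockedManager is a hypothetical class, not the NGL code.

```cpp
#include <mutex>

// Hypothetical singleton guarded by a mutex. std::lock_guard plays the
// role of QMutexLocker: it locks on construction and unlocks when it
// goes out of scope.
class LockedManager
{
public:
  static LockedManager *instance()
  {
    std::lock_guard<std::mutex> lock(s_mutex);
    if (s_instance == nullptr)
    {
      s_instance = new LockedManager();
    }
    return s_instance;
  }

  // same effect as inheriting from boost::noncopyable
  LockedManager(const LockedManager &) = delete;
  LockedManager &operator=(const LockedManager &) = delete;

private:
  LockedManager() {}
  static std::mutex s_mutex;
  static LockedManager *s_instance;
};

std::mutex LockedManager::s_mutex;
LockedManager *LockedManager::s_instance = nullptr;
```

A lock like this only protects against two threads racing through instance() on the same static pointer. It does nothing if there are two separate copies of the static pointer to begin with.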

This, however, was not the solution, as the problem still continued.

Digging Deeper

After digging deeper into the code I traced exactly where the rogue call to the singleton ctor came from. In my TransformStack there is a convenience method to load the current Transform into a parameter of the shader. The code is actually in the ngl::Transform class and looks like this.

void Transformation::loadMatrixToShader(
                                        const std::string &_shader,
                                        const std::string &_param,
                                        const ACTIVEMATRIX &_which
                                       )
{
  computeMatrices();
  ShaderManager *shader=ngl::ShaderManager::instance();
  switch (_which)
  {
    case NORMAL :
    {
      shader->setShaderParamFromMatrix(_shader,_param,m_matrix);
    }
    break;
    case TRANSPOSE :
    {
      shader->setShaderParamFromMatrix(_shader,_param,m_transposeMatrix);
    }
    break;
    case INVERSE :
    {
      shader->setShaderParamFromMatrix(_shader,_param,m_inverseMatrix);
    }
    break;
  }
}

This method actually lives in the library, and hence under Windows in the DLL, and this is where the problem begins.

Separate compilation units
It seems that under Windows the static data in a DLL is not shared with the application in the way it is with Unix / Mac shared libraries. When code inside the DLL asks for the shader manager singleton, it finds no instance in the DLL's own memory space, even though the main application linking to the DLL has already created one, so a second instance is constructed. In other words the singleton is not shared by default. There is a good post here about it.

To test that this was the case I quickly created a static global variable within the transform class, which could be set from outside the DLL module by a quick method call. This passed in the application side instance of the shader manager class and used that in the call, and it fixed the problem. However, Rob the Bloke pointed out the error of my ways by saying:

"That's a critical bug and refactor waiting to happen. If you need to share something across DLL boundaries, there is exactly one way to achieve that. Your solution fails the second anyone adds another DLL that links to ngl (i.e. you've just offset the problem for your students to deal with later). s_globalShaderInstance *needs* to be DLL exported."

Making it a "proper" dll

It was now time to bite the bullet and do a major refactor of the code to make all the classes export from the DLL properly. I've never done this before and the initial attempts were somewhat problematic; however, the basic approach is as Rob said:

"if you need to access something from a DLL, it needs to be DLL_EXPORTED. End of story" 

This code is only needed in the Windows version, and depending upon whether we are building the library (DLL) or using the library, we need to mark our classes with either __declspec(dllexport) or __declspec(dllimport) respectively.

So we need some pre-compilation glue to make this work: first it must apply under Windows only, and then it depends upon whether we are building the DLL or using the DLL. The following code was added to the Types.h header

#ifdef WIN32
  #ifdef BUILDING_DLL
    #define NGL_DLLEXPORT __declspec(dllexport)
  #else
    #define NGL_DLLEXPORT __declspec(dllimport)
  #endif
#else
    #define NGL_DLLEXPORT
#endif

Once this has been added to the types file we need to add the NGL_DLLEXPORT macro to the declaration of each of the classes (BUILDING_DLL itself is defined in the library's .pro file). This is shown below.

class NGL_DLLEXPORT Vector
{
 ...
};

Once this was added it seemed to work for all the classes in the basic demo. However, some of the more advanced demos still failed to compile. I have several functions in the library which also needed to be exported; this was done as follows

// Util.h
extern NGL_DLLEXPORT ngl::Real radians(
                                        const Real _deg
                                       );
// Util.cpp
NGL_DLLEXPORT Real radians(
                            const ngl::Real _deg
                            )
{
  return (_deg/180.0f) * M_PI;
}

And finally any extraction / insertion operators, as they are friend functions, will also need to be exported

friend NGL_DLLEXPORT std::istream& operator>>(std::istream& _input, ngl::Vector &_s);

NGL_DLLEXPORT std::istream& operator>>(
                                                 std::istream& _input,
                                                 ngl::Vector& _s
                                                )
{
  return _input >> _s.m_x >> _s.m_y >> _s.m_z;
}
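
Putting the three cases together (class, free function and friend operator), a minimal self-contained sketch might look like this. The names are hypothetical: DEMO_DLLEXPORT, DEMO_BUILDING_DLL, DEMO_USING_DLL, Vec3 and parseVec3 are stand-ins and not part of NGL. Unlike the macro in Types.h, this one expands to nothing unless one of the two defines is set, which is an assumption made so that the example also compiles as a single file.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Export on Windows when building the DLL, import when using it,
// and expand to nothing everywhere else.
#if defined(_WIN32) && defined(DEMO_BUILDING_DLL)
  #define DEMO_DLLEXPORT __declspec(dllexport)
#elif defined(_WIN32) && defined(DEMO_USING_DLL)
  #define DEMO_DLLEXPORT __declspec(dllimport)
#else
  #define DEMO_DLLEXPORT
#endif

// 1. an exported class
class DEMO_DLLEXPORT Vec3
{
public:
  Vec3() : m_x(0.0f), m_y(0.0f), m_z(0.0f) {}
  float x() const { return m_x; }
  float y() const { return m_y; }
  float z() const { return m_z; }
  // 2. an exported friend operator, declared inside the class
  friend DEMO_DLLEXPORT std::istream &operator>>(std::istream &_input, Vec3 &_s);

private:
  float m_x;
  float m_y;
  float m_z;
};

DEMO_DLLEXPORT std::istream &operator>>(std::istream &_input, Vec3 &_s)
{
  return _input >> _s.m_x >> _s.m_y >> _s.m_z;
}

// 3. an exported free function
DEMO_DLLEXPORT float radians(float _deg)
{
  return (_deg / 180.0f) * 3.14159265f;
}

// small helper so the operator can be exercised from a string
Vec3 parseVec3(const std::string &_text)
{
  std::istringstream input(_text);
  Vec3 v;
  input >> v;
  return v;
}
```

For example, parseVec3("1 2 3") fills the three components through the exported operator, and radians(180.0f) gives a value close to pi.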

Finally it's working !

Or so I thought. Some of the simple programs that only used the ngl library compiled and ran correctly. However, one of the programs which made a call to glUseProgram(0) in the main application failed to build due to a linker error. The error said that the function was not found and gave an imp__glew.... style symbol name, which means that the actual function could not be resolved.

I thought that I had statically bound GLEW into my lib so it would work. However, GLEW binds its function calls to the driver once the glewInit function is called, and in this case that happens in the DLL but not in my application. To overcome this problem I need to put the glew.c source code into my applications as well as the library.

To do this the following is added to the application .pro file (as the glew.c file is shipped with ngl)

win32: {
        DEFINES += WIN32
        DEFINES += _WIN32
        DEFINES += USING_GLEW
        DEFINES += GLEW_STATIC
        # qmake adds the -I prefix itself, so only the path is given
        INCLUDEPATH += C:/boost_1_44_0
        INCLUDEPATH += C:/boost
        INCLUDEPATH += C:/NGL/Support/glew
        LIBS += -LC:/NGL/lib
        LIBS += -lmingw32
        # build the copy of GLEW shipped with ngl into the application
        SOURCES += C:/NGL/Support/glew/glew.c
}


And finally we need to initialise glew in both the DLL and Application

ngl::NGLInit *Init = ngl::NGLInit::instance();
#ifdef WIN32
  glewInit();
#endif
Init->initGlew();

This is not an ideal solution but is quite easy to do and saves the hassle of building the glew library using Qt / mingw. I will at some stage do this but for now this quick hack will suffice.
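
Why both calls are needed can be shown with a toy model. This is only an illustration of the idea and does not use GLEW itself: GlewCopy, UseProgramFn and driverUseProgram are hypothetical names. Each module that compiles glew.c gets its own table of function pointers, and each table stays empty until that module's own init call has run.

```cpp
// Toy model of one statically compiled copy of GLEW.
typedef void (*UseProgramFn)(unsigned int);

// stands in for the real entry point provided by the driver
static void driverUseProgram(unsigned int) {}

struct GlewCopy
{
  // stands in for the function pointer behind glUseProgram
  UseProgramFn useProgram = nullptr;

  // stands in for glewInit(): binds the pointer to the driver
  void init() { useProgram = &driverUseProgram; }

  // true once this copy can safely make the call
  bool ready() const { return useProgram != nullptr; }
};
```

With one GlewCopy for the DLL and one for the application, calling init() on the DLL's copy leaves the application's copy unbound, which mirrors the two separate initialisation calls shown above.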

And for my next trick

Now that I've sullied my mac by installing Windows, I think I need to do a bit more Windows support. The next stage is going to be building a Visual Studio version of ngl. Once this is done I may also add a DirectX back end to it; however, for now I will stick to the mac ;-)

2 comments:

  1. Hello,

    I have tried to build the NGL lib on Windows. The best approach thus far was converting the .pro to a visual studio project and building that.

    However once I try to build a sample project using the LIB all things go to hell...

    This problem is in the Singleton class (Singleton.h). There I can see that we are trying to export the whole class rather than the methods of the class? This is no problem during lib/dll building but when trying to import the lib inside a project all I get is this error:

    Error 1 error C2491: 'ngl::Singleton::Singleton' : definition of dllimport function not allowed c:\ngl\include\ngl\Singleton.h 77 1 ConsoleApplication1

    From what I can see in the class only the dllexport part is covered... I was wondering how can we fix this? Or better yet how can we successfully build an NGL demo on Windows using the NGL dll/lib... It seems there aren't any examples or Howtos around...

    Best Regards

  2. What compiler are you using? I will have to investigate. I use mingw under Windows but have not tried to build it for a while.
