
Newbie’s Experience Setting Up a Pinax Site

As I mentioned last week, I’m using Pinax to test a new website for my research lab.  The site is now in private alpha, which means I’m getting some internal users to log in, test it out, and give me feedback, and I don’t think it will fall over.  I’m writing up my experience to 1) help others get started quickly, and 2) start talking about how the pinax community can get people ramped up faster.

My background:

  • Journeyman Python Programmer. Can independently write code, diagnose most problems.
  • Journeyman HTML/CSS/Javascript. Can independently write code, diagnose most problems.
  • Some experience doing apache configuration, though not recently. No experience with mod_python or mod_wsgi. I largely rely on documentation and examples found on the interwebs.
  • No major prior experience with django, and none deploying it via Apache/mod_{python,wsgi}. I almost exclusively rely on documentation and examples found on the web (although my python experience helps)

The nice thing about pinax is that it has an active community; in particular an active IRC channel (#pinax on freenode). The pinax maintainers have been there, patiently answered my newbie questions, and got me going.

Step 1 – Get the base pinax software running.

Download and install django and pinax.  I first did this on a WinXP desktop. Crap. The templates aren’t rendering properly. I try commenting out the offending lines in the templates, but problem after problem comes up. Jump on #pinax.  Did you seed the database with manage.py syncdb?  Uhh, no.  I do that, but I’m still having problems.  Back to #pinax.  Are you running WinXP?  Why yes!  Oh, pinax on Windows has a known bug (which really is caused by one of the external apps they rely on).  Copy some files to a new location, or install on an OS which supports soft links.  I choose door 2, check out the code again on my Ubuntu machine, sync the database, run the django dev server (manage.py runserver) and we’re good to go.

Step 2 – Make sure everything in the base site works as expected.

The pinax project comes with a complete working example under the directory projects/complete_project.  I copy this to a new directory so that I can refer back to the original complete_project in case I screw something up.  Because the copy has a new directory name, you have to change the ROOT_URLCONF in settings.py to match.

Added some users via the admin interface…looks great.  Added an account via the Sign-Up page.  Accounts are created (with or without OpenID), but no confirmation emails are being sent.  Hmm.  Check out the included django-mailer app, which uses the same configuration as the standard django.core.mail.  We need to add some settings to settings.py: EMAIL_HOST, EMAIL_HOST_USER, EMAIL_HOST_PASSWORD.  I set these to use my gmail account to do email.  But really, it’d be better to have the emails coming from the local machine, and I really don’t want to hardcode my gmail password into the config file.  I need sendmail on this machine (sudo apt-get install sendmail).  Now that the local machine has its own SMTP server, I can remove the EMAIL_* settings from settings.py.
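For reference, here is a minimal sketch of what the interim Gmail settings might look like.  The host and port are Gmail's usual SMTP submission values; the address is a placeholder, and GROK_EMAIL_PASSWORD is a hypothetical environment variable name, used here only to show one way of keeping the password out of the file.

```python
# settings.py fragment (sketch): send mail through Gmail's SMTP server.
import os

EMAIL_HOST = "smtp.gmail.com"
EMAIL_PORT = 587                       # SMTP submission port, used with STARTTLS
EMAIL_USE_TLS = True
EMAIL_HOST_USER = "someone@gmail.com"  # placeholder address
# GROK_EMAIL_PASSWORD is a hypothetical variable name; reading it from the
# environment keeps the password out of the settings file.
EMAIL_HOST_PASSWORD = os.environ.get("GROK_EMAIL_PASSWORD", "")
```

With a local SMTP server on the default port, none of these are needed, since django falls back to localhost.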

Hmm…emails still not going out.  Let’s look at the django-mailer docs.  Oh, I see…email messages aren’t actually sent until manage.py send_mail is run from the command line.  The recommended solution is to add a couple lines to the system crontab so that they are run regularly (this is now much clearer in the pinax deployment docs).  Now we’re sending emails!

Step 3 – Configure templates, images, and site text for our lab.

Ok…first thing is that the default site says ‘Pinax’ everywhere.  I need to change that to the name of our lab, GROK Lab.  There’s an outstanding ticket to use a SITE_NAME variable in settings.py which would be displayed throughout the site.  Until that gets resolved, we just have to find all the hardcoded [Pp]inax strings and replace them with the SITE_NAME variable.  I just used grep to find them all, and then I replaced the ones I needed to replace (I left things like Built with Pinax, About Pinax, James Tauber is the awesomest creator of Pinax ever, etc).  Pinax has several translations available, which I think means that I should have been making most of my changes to externalized text strings in locale/en/LC_MESSAGES.  But I didn’t, and just changed the template files.  I’m not sure what will happen when another language is selected.  Should learn more about django internationalization.
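The grep step can also be sketched in Python, which is handy on a machine without grep.  This is only an illustration: find_hardcoded is a hypothetical helper name, and the "templates" directory in the usage comment is an assumption about where the project keeps its templates.

```python
import re
from pathlib import Path

# Same pattern as the grep search: "Pinax" or "pinax".
PATTERN = re.compile(r"[Pp]inax")


def find_hardcoded(template_dir, pattern=PATTERN):
    """Yield (path, line_number, line) for every line matching pattern."""
    for path in sorted(Path(template_dir).rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                yield path, number, line.strip()


# Usage (assumes a "templates" directory next to this script):
#     for path, number, line in find_hardcoded("templates"):
#         print(f"{path}:{number}: {line}")
```

It only reports matches; deciding which ones to replace is still a manual job.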

Update 11/14: Fernando Correia did some more digging on how to properly externalize strings in django.

One particular place this got me into trouble was, again, the django-mailer app.  Invitations to join the site are generated via two templates: templates/friends/join_invite_message.txt and templates/friends/join_invite_subject.txt.  I wanted the subject of my email to say “You have been invited to join the new SITE_NAME.”  I tried to use the variable substitution syntax {% site_name %} but got an exception that I couldn’t use that tag inside a trans tag.  Read more django docs to learn about template tags and internationalization.  I also tried changing the join_invite_subject.txt to use {% blocktrans %}, similar to the other template file in that directory.  Oops.  I accidentally added newlines to the subject template, which django-mailer does not like.  I revert to the original join_invite_subject file, and simply hardcode ‘GROK Lab’ into the subject string.  Hmm…still getting the same error messages about newlines in the header, even though I’ve changed the template.  Oh, the old messages with the bad header are still in the email queue…it’s trying to send those, not render new ones based on this template.  I clear the emails with the bad header out of the database via the admin interface, and email is working again.  Whew…
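The newline problem can also be guarded against in code.  Here is a minimal sketch, assuming the subject template has already been rendered to a string; single_line_subject is a hypothetical helper, not something Pinax or django-mailer provides.

```python
def single_line_subject(rendered):
    """Collapse a rendered subject template onto a single line.

    str.split() with no arguments splits on any run of whitespace,
    including newlines and indentation, so joining the pieces with
    single spaces leaves no line breaks for the mail header.
    """
    return " ".join(rendered.split())


# A subject as it might come out of a multi-line template.
raw = "\n  You have been invited to join\n  GROK Lab\n"
subject = single_line_subject(raw)
```

This would not rescue messages already sitting in the queue with a bad header; those still have to be cleared by hand.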

The logo image says Pinax.  That’s no good…I want to replace that with our lab logo.  Everything is nicely styled with CSS.  My image is a slightly different shape than the pinax logo, so I need to change the height/width in the base.css #tabhead .logo img selector to match my logo file (I made our logo the same height as the original pinax logo to simplify).

This is supposed to be a serious research site, so we probably don’t need games (although I played some on cloud27 — they’re pretty fun, and might be a nice way to encourage a more active community).  I commented out the arcade and games apps in the settings.py, all the related ARCADE_* variables, and the URLpattern in urls.py.  Oops…many of the templates expect to have the arcade app installed (for example, the navigation tabs are hard coded).  Use grep again to find all mentions of ‘arcade,’ and comment them all out.

Finally, there was some functionality that the default installation hides if the user is not logged in.  I wanted to be able to have some read views of the people, projects, blogs, tweets available to anyone who visited the site, but only allow writing if they were logged in.  This involved again grepping through the templates for the right section of code, and adding {% if user.is_authenticated %} … {% endif %} tags.

Step 4 – Configure site to run through Apache instead of the dev server.

So I had the site up and running using the django development server on the default port 8000.  But I needed it deployed through apache so that in case of a server restart, I did not have to restart the site by hand.  The most commonly referenced post for deploying pinax is here.  It gives two sample configurations depending on whether you want to use apache modules mod_wsgi or mod_python.  Which one do I want to use?  What’s the difference between the two?  No idea.  Look up django docs on deployment; it seems to recommend mod_python, so I’ll try that. I wrestle with that for a few hours, but little success.  Back to #pinax, where the consensus is mod_wsgi good, ignore mod_python.  Ok, after another few minutes of messing with my apache site config file, and the pinax.wsgi file, we got it up and running.  A few gotchas:

  • Added the appropriate WSGIScriptAlias to my apache site config.  I could view the site; however, I got errors that apache didn’t have permissions to write to the db.  Added WSGIDaemonProcess to the site config and that fixed it.
  • I could access the admin site, but it did not have the correct styles/media.  Needed to add Alias and Directory directives to the apache site config.
  • Changed the name of the pinax.wsgi to match our site name; needed to change the DJANGO_SETTINGS_MODULE variable inside this file to match.

So the appropriate section of my apache VirtualHost configuration looks like this:

# attempt to setup pinax
WSGIScriptAlias / /path/to/your/pinax/project/deploy/pinax.wsgi
WSGIProcessGroup name
WSGIDaemonProcess name user=unixusername group=unixgroup threads=25

Alias /media/ "/path/to/django/contrib/admin/media/"
<Directory "/path/to/django/contrib/admin/media/">
Order allow,deny
Options Indexes
Allow from all
IndexOptions FancyIndexing
</Directory>

When debugging this, it was helpful to watch all the appropriate log files to see what error messages were being generated where: tail -f apache.log apacheerror.log project.send_mail.log project.send_mail_deferred.log.

Conclusions

Clearly one area where we can improve pinax is the installation experience.  Compare it to the wordpress install experience…download into a web accessible directory, open a web page, short configuration, done.

More difficult to deal with is the fact that, in order to use pinax, you need to acquire at least a passing knowledge of django convention and configuration, apache convention and configuration, and all the external apps bundled with pinax.  This can be a tall order.

Still, I got my site up and running, with much more integrated functionality than I would have had with django alone.  I learned a lot about pinax, django, and apache, and should have less trouble on my next site.  And I get the desired ‘oohs’ and ‘ahhs’ and ‘damn, how long did it take you to do this?’ from people in the demo.

The next step is a bit more fun.  Now that I’ve got a basic feel for the technology, I can really think about how to design the site and workflows to best serve the users.


Towards Evaluating Social Media for Scholarly Communication

There are A LOT of people and organizations that are looking at ways of using modern web technologies (web2.0, social media, collaboration, and other buzzwords as well) to enhance the creation, modification, and dissemination of their research and other scholarly work.  There’s even a conference going on right now discussing the matter (see some of the conference discussion on FriendFeed).  And it seems like every day there are a host of new tools, start ups, and web sites to enhance collaboration, sharing, and communication between scientists.

With so many web sites evolving so quickly, there has been a call to figure out how to critically evaluate, compare, and contrast the tools.  One way to look at all the sites and tools is to examine how well they achieve the core goals of scientific communication*:

  • Registration of a new idea or claim to an individual or group of collaborators
  • Certification / peer-review of a claim
  • Awareness / access to the details of the claim
  • Archival of the claim
  • Reward for the registrant(s)

Imagine we could assign each site or tool a score along each of these goals.  We could then plot the cumulative score on a radar graph like this:

Radar Graph Visualization of Social Media for Scholarly Communication

This type of graph can help decision makers visualize how well different systems fulfill different goals of scholarly communication, how they are lacking, and overall what are the opportunities for development of future tools.
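To make the idea concrete, here is a minimal sketch of the geometry behind such a graph: each goal gets its own axis, spaced evenly around a circle, and a site's score sets how far out along that axis its vertex sits.  The 0-5 scale, the function name radar_points, and the example scores are all assumptions for illustration, not measurements of any real site.

```python
import math

GOALS = ["Registration", "Certification", "Awareness", "Archival", "Reward"]


def radar_points(scores, goals=GOALS):
    """Return one (x, y) vertex per goal for a radar graph.

    The first axis points straight up and the rest follow clockwise;
    each vertex lies at a distance from the origin equal to its score.
    """
    n = len(goals)
    points = []
    for i, goal in enumerate(goals):
        angle = math.pi / 2 - 2 * math.pi * i / n
        r = scores[goal]
        points.append((r * math.cos(angle), r * math.sin(angle)))
    return points


# Made-up scores on an assumed 0-5 scale, for a hypothetical site.
example = {
    "Registration": 5,
    "Certification": 1,
    "Awareness": 4,
    "Archival": 3,
    "Reward": 0,
}
polygon = radar_points(example)
```

Joining the vertices in order gives the site's polygon; drawing several sites on the same axes shows at a glance where each one is strong or lacking.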

Note that the scores in the image are not at all rigorously determined.  I made up some quick estimates for a few sites, and compared them to made-up estimates for publication in a high impact journal such as Nature.  I made up the estimates based on the following loose criteria:

Registration: A contribution or claim can be attributed to an individual or a set of contributors, with a creation date time stamp and revision history.

Certification: A contribution can be rated by others.  There can be a few, influential raters (eg editorial board) or many (eg crowd sourcing/collaborative filtering).  Ratings can be anonymous or attributed.  Ratings can be meta-rated (eg the Slashdot moderation system).  Ratings can be simple thumbs up/thumbs down, or with comments/feedback.

Awareness/Access: Users can identify new contributions, as well as contributions that are relevant to their interests.  Awareness tools can range from passive (user must browse/search) to active (system recommendations based on clustering or collaborative filtering).  Entries and metadata for contributions are queryable/accessible via a publicly documented API, open standard, or format.

Archival: A contribution can be identified and accessed by a single URI (possibly with multiple resource URLs; see Fielding’s thesis on REST).  Associations to similar data via metadata are present.  Contributions are exportable into a documented open standard or format.  Contributions will be available at the URI for the foreseeable future.

Reward: A contribution counts toward professional career advancement, or standing within the academic discipline. (Obviously, social media is currently lacking in this area).

These criteria attempt to combine the traditional requirements for scholarly communication, with modern needs/expectations of web2.0 technologies and open science.  There is certainly room for refinement, and I’d welcome comments from the peanut gallery.

Hmm…maybe there’s room for a collaborative filtering type tool for science web2.0 tools.  People can rate sites on each of these, and other interesting criteria…

* See Roosendaal, H., & Geurts, P. A. T. M. (1998). Forces and functions in scientific communication. In . Retrieved July 25, 2008, from http://www.physik.uni-oldenburg.de/conferences/crisp97/roosendaal.html.  Also Van de Sompel, H., Payette, S., Erickson, J., Lagoze, C., & Warner, S. (2004). Rethinking Scholarly Communication: Building the System that Scholars Deserve. D-Lib Magazine, 10(9). Retrieved August 12, 2008, from http://www.dlib.org/dlib/september04/vandesompel/09vandesompel.html.
