dvn
LOGS
19:01:38 <pdurbin> #startmeeting
19:01:38 <zodbot> Meeting started Wed Feb 13 19:01:38 2013 UTC.  The chair is pdurbin. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:38 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:01:49 <pdurbin> #topic intro
19:02:21 <pdurbin> hello and welcome! I've called this meeting to talk about integrating Shibboleth into the Dataverse Network (DVN)
19:03:18 <pdurbin> my web page is http://www.iq.harvard.edu/people/philip-durbin and i own the ticket we're using to track the integration: https://redmine.hmdc.harvard.edu/issues/2657
19:03:53 <pdurbin> here's the agenda:
19:04:05 <pdurbin> - summarizing how authentication works in a Dataverse Network today
19:04:15 <pdurbin> - reaching a common understanding of what Shibboleth is
19:04:27 <pdurbin> - roughing out a plan for how to integrate Shibboleth into the DVN
19:04:41 <pdurbin> - discussing implications of Shibboleth integration
19:04:58 <pdurbin> everyone should feel free to jump in at any time
19:05:23 <pdurbin> you're welcome to link to a page about yourself or otherwise say who you are like I did above :)
19:05:40 <pdurbin> #topic summarizing how authentication works in a Dataverse Network today
19:05:55 <pdurbin> let's use https://dvn-demo.iq.harvard.edu/dvn/ as an example. It's running DVN 3.3
19:06:13 <pdurbin> there are two links at the top right: "Create Account" and "Log In"
19:06:30 <pdurbin> "Create Account" takes you to a pretty standard form asking for a username, password, email, etc.
19:06:46 <pdurbin> after filling out the form you're logged in
19:07:02 <pdurbin> you can create a dataverse, create studies, upload files, etc.
19:07:32 <pdurbin> the user account you created is stored locally, in a PostgreSQL database
19:07:50 <pdurbin> all in all, i think it's a pretty standard way for a webapp to behave
19:08:12 <pdurbin> if there are any questions or comments about this, please go ahead
19:09:08 <pdurbin> I *could* go into how at least for the DVN at https://dvn.iq.harvard.edu/dvn/ we have the concept of being a Harvard Affiliate...
19:10:04 <pdurbin> which is based on IP addresses or logging in through what we at Harvard call "PIN auth"
19:10:45 <pdurbin> ... but I think I'll go ahead and talk some more about Shibboleth :)
19:11:03 <pdurbin> #topic reaching a common understanding of what Shibboleth is
19:11:18 <pdurbin> according to http://shibboleth.net "Shibboleth is an open-source project that provides Single Sign-On capabilities and allows sites to make informed authorization decisions for individual access of protected online resources in a privacy-preserving manner"
19:11:35 <pdurbin> http://shibboleth.net/about/basic.html has more about the main "actors" in Shibboleth. I want to focus on the Identity Provider (IdP) vs. the Service Provider (SP)
19:12:01 <pdurbin> the Identity Provider (IdP) authenticates the user. http://idp.testshib.org is an example IdP. It stores all the usernames and passwords (and other info like email address, etc.)
19:12:13 <pdurbin> Jon__: as I understand it http://its.unc.edu/service-catalog/shibboleth/ describes the IdP that Odum would use
19:12:32 <pdurbin> here's the official write up of what an IdP is: http://shibboleth.net/products/identity-provider.html
19:12:59 <pdurbin> a Service Provider (SP) registers with an IdP to... provide a service
19:13:18 <pdurbin> the official write up is at http://shibboleth.net/products/service-provider.html but I'd like to move on to a real-world example I cooked up
19:13:46 <pdurbin> I've taken a server called "dvn-vm2" and have made it a Service Provider (SP). The service it's providing is access to a protected area Shibboleth people call a "resource"
19:14:07 <pdurbin> Let's say we've licensed STATA and want to let students download it after they've logged in to Shibboleth.
19:14:32 <pdurbin> Back to the server... I've registered it as a Service Provider (SP) with the Identity Provider (IdP) from testshib.org
19:14:45 <pdurbin> If you click https://dvn-vm2.hmdc.harvard.edu/secure/ you should be redirected to a login page at https://idp.testshib.org/idp/Authn/UserPassword
19:14:59 <pdurbin> you can go ahead and try it :)
19:15:41 <pdurbin> you should see some usernames and passwords listed... such as myself/myself or alterego/alterego
19:15:53 <pdurbin> After login you should be taken back to https://dvn-vm2.hmdc.harvard.edu/secure/ and see "Secure area"
19:16:33 <pdurbin> let me pause for a minute and ask if anyone here is able to successfully log in to https://dvn-vm2.hmdc.harvard.edu/secure/ ?
19:16:46 <sbmarks> yep!
19:16:56 <pdurbin> sbmarks: awesome. thanks
19:17:18 <pdurbin> so you get a sense of what's happening :)
19:17:23 <pdurbin> https://github.com/dvn/shibpoc contains all the configuration for how I set up dvn-vm2
19:17:36 <pdurbin> If you jump down to https://github.com/dvn/shibpoc#http-headers-from-login-test you'll see the various URLs your browser hits as you log in ... idp/profile/SAML2/Redirect/SSO?SAMLRequest... /idp/AuthnEngine ... /idp/Authn/UserPassword ... /idp/profile/SAML2/Redirect/SSO ... /Shibboleth.sso/SAML2/POST etc.
19:18:07 <pdurbin> marlena: i'm sure you understand all those URLs better than i do :)
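The SAMLRequest query parameter in the first of those URLs comes from the SAML HTTP-Redirect binding: the request XML is raw-DEFLATE compressed, base64 encoded, then URL-encoded. Below is a minimal Python sketch of that round trip, handy for inspecting what the browser is carrying. The sample request and the idp.example.org host are made up for illustration, and a real AuthnRequest carries more attributes and child elements.

```python
import base64
import zlib
from urllib.parse import parse_qs, quote, urlsplit


def encode_redirect_param(xml: str) -> str:
    """Raw-DEFLATE, base64, then URL-encode, as the HTTP-Redirect binding does."""
    compressor = zlib.compressobj(wbits=-15)  # negative wbits: raw DEFLATE, no zlib header
    deflated = compressor.compress(xml.encode("utf-8")) + compressor.flush()
    return quote(base64.b64encode(deflated), safe="")


def decode_redirect_param(url: str) -> str:
    """Pull SAMLRequest out of a redirect URL and reverse the encoding."""
    query = parse_qs(urlsplit(url).query)  # parse_qs undoes the URL-encoding
    deflated = base64.b64decode(query["SAMLRequest"][0])
    return zlib.decompress(deflated, -15).decode("utf-8")


# Hypothetical, heavily trimmed request used only to exercise the round trip.
sample_request = '<samlp:AuthnRequest ID="_example" Version="2.0"/>'
sample_url = (
    "https://idp.example.org/idp/profile/SAML2/Redirect/SSO?SAMLRequest="
    + encode_redirect_param(sample_request)
)
```

Calling decode_redirect_param(sample_url) gives back the original XML string. The response that arrives at /Shibboleth.sso/SAML2/POST uses a different binding (a base64-encoded form field, no DEFLATE), so this decoder would not apply to it.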
19:18:53 <pdurbin> so the thing we're protecting is /secure
19:19:00 <pdurbin> and everything underneath it
19:19:14 <pdurbin> on disk /secure is /var/www/html/secure
19:19:33 <pdurbin> and you can imagine goodies in there like /var/www/html/secure/STATA-installer.exe
19:19:58 <pdurbin> licensed software, that is, that we want to protect from the public
19:20:34 <pdurbin> we only want students to download it. after they have authenticated via Shibboleth, via our IdP
19:21:13 <pdurbin> again, I've registered the Service Provider (that dvn-vm2 server) with a specific IdP at testshib.org
19:22:11 <pdurbin> I think of this as the "hello world" of Shibboleth... it's a pretty basic use case really... trying to protect some files from being downloaded by the public
19:22:27 <pdurbin> does this make sense to people?
19:22:39 <sbmarks> yup. makes sense!
19:22:56 <pdurbin> sbmarks: cool
19:23:05 <pdurbin> anyone else out there? :)
19:23:22 <marlena> Sure it makes sense -- but shibb is ignorant of what you are trying to protect.
19:23:23 <bobtreacy> I'm lurking
19:23:29 <gdurand> same here
19:23:36 <pdurbin> heh
19:23:45 <pdurbin> marlena: well.. i was wondering that too
19:23:54 <marlena> By the time you are ready to make an authorization decision -- for downloading files or whatever, shibb should be out of the picture.
19:24:03 <pdurbin> how does shibboleth know that it's supposed to protect /secure and not some other area
19:24:25 <marlena> That has to do with your httpd configuration.
19:24:41 <pdurbin> i actually created http://dvn-vm2.hmdc.harvard.edu/open/ to say "Wide open area" and it *doesn't* require any shibboleth auth
19:24:51 <pdurbin> marlena: yes! exactly
19:25:29 <pdurbin> there is an Apache config file that ships with the shibboleth RPM: /etc/httpd/conf.d/shib.conf
19:26:31 <pdurbin> it has standard apache stanzas... looks like this: <Location /secure> AuthType shibboleth ... </Location>
19:27:03 <pdurbin> i put the whole stanza at https://github.com/dvn/shibpoc
19:28:34 <pdurbin> #info https://dvn-vm2.hmdc.harvard.edu/secure/ example shows protecting files from download
19:28:57 <pdurbin> #topic roughing out a plan for how to integrate Shibboleth into the DVN
19:29:12 <pdurbin> Unlike the "hello world" dvn-vm2 example above with a "secure" area for software downloads, DVN is a full Java EE web application (running on Glassfish)... the identity of users is fundamental to DVN's operation
19:29:41 <pdurbin> I think we can all agree on this :)
19:29:47 <sbmarks> seems fair
19:30:24 <pdurbin> every dataverse network has users. users create dataverses. and permissions are granted on studies within those dataverses
19:30:41 <pdurbin> each dataverse is kind of a world of access control :)
19:30:57 <pdurbin> i mean, some people choose to have everything public
19:31:15 <pdurbin> but many users restrict access to their studies and data to certain other users
19:32:16 <pdurbin> also, the dvn-vm2 example was all or nothing... either you have access to that /secure area or you don't
19:32:41 <marlena> That's not strictly correct.
19:32:45 <pdurbin> with the DVN your view of a page might be different depending on whether you're logged in or not
19:33:01 <pdurbin> marlena: no?
19:33:30 <marlena> Any time a user does a GET, you (the app) get to make an authorization decision -- presumably based on a cookie you've sent (after you've gotten attributes via the Shibb authentication step).
19:34:31 <pdurbin> hmm, ok. sounds good. i don't feel like i'm making any decisions in my hello world dvn-vm2 example though. but i guess i could somehow
19:35:30 <marlena> If you haven't set a cookie or the cookie is expired you can redirect the user to a shibb-protected url -- and then get attributes for them.
19:35:43 <marlena> That's one way of doing things any way  :-).
19:35:49 <pdurbin> :)
19:35:50 <pdurbin> ok
19:36:06 <marlena> It's actually a fairly typical way AFAIK.
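The pattern marlena describes (the app decides on every GET from its own session cookie and only bounces the user to a Shibboleth-protected URL when the cookie is missing or expired) can be sketched roughly as follows. This is an illustration in Python rather than DVN's Java code; the session lifetime, the /secure/login path, the in-memory session table, and the "students only" rule are all assumptions made up for the example.

```python
import time
from urllib.parse import quote

SESSION_LIFETIME_SECONDS = 8 * 60 * 60      # assumed session lifetime
SHIB_PROTECTED_URL = "/secure/login"        # hypothetical Shibboleth-protected path

# session id (the cookie value) -> attributes captured at login, plus creation time
sessions = {}


def is_authorized(attributes, path):
    """Hypothetical rule: only students may fetch anything under /restricted/."""
    if path.startswith("/restricted/"):
        return "student" in attributes.get("affiliation", [])
    return True


def handle_get(path, session_id, now=None):
    """Return (decision, location) for one GET request."""
    now = time.time() if now is None else now
    session = sessions.get(session_id)
    if session is None or now - session["created"] > SESSION_LIFETIME_SECONDS:
        # No cookie, unknown cookie, or expired session: re-authenticate via Shibboleth.
        return ("redirect", SHIB_PROTECTED_URL + "?target=" + quote(path, safe=""))
    if is_authorized(session["attributes"], path):
        return ("ok", path)
    return ("forbidden", path)
```

Shibboleth only shows up in the redirect branch. Once attributes are stored in the session, the authorization decision is plain application logic, which matches the earlier remark that Shibboleth should be out of the picture by then.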
19:36:34 <pdurbin> well, the point i'm driving at is that auth is fundamental to the DVN so the integration with shibboleth needs to be complete... it needs to be deep
19:37:25 <marlena> Yes to "complete."
19:37:28 <pdurbin> and we need to make sure the shibboleth integration is a toggle switch... most DVN installations don't have a local Shibboleth Identity Provider (IdP) to point at
19:38:18 <marlena> I think the idea of "local idp" is a bit off-base.
19:38:36 <marlena> The whole point is that the user logs in at their home (remote) institution.
19:38:36 <pdurbin> is it?
19:38:43 <marlena> At least that was the original point :-).
19:39:02 <pdurbin> ok, let's same "home institution" :)
19:39:15 <pdurbin> whoops. let's say "home institution" :)
19:40:00 <pdurbin> anyway, at the end of the day we need to code this up in Java somehow and right now I'm seeing three different approaches we could take
19:40:06 <pdurbin> three different directions
19:40:20 <pdurbin> and maybe we'll try a couple and see which works best
19:40:48 <pdurbin> let's call the first option "fronting Glassfish with Apache"
19:40:51 <marlena> Here's one way: When the user hits your authentication page, they can pick "local login" if they want to use their existing dataverse name/pwd or pick from a list of IdPs that the Dataverse talks to.
19:41:15 <sbmarks> one thing we want to flag for now or later: we are a consortium and would have to rely on multiple IdPs
19:41:28 <sbmarks> ah k
19:42:02 <sbmarks> a wayf page
19:42:15 <sbmarks> (i have expert help sitting next to me)
19:42:38 <marlena> In essence -- except that it also allows local login.
19:42:42 <bobtreacy> Does anybody know, is there something shib provides that is different from other SAML implementations?
19:42:44 <sbmarks> cool
19:43:05 <marlena> What the shibb code provides that SAML doesn't is...
19:43:29 <marlena> ....a fairly rich set of attribute acquisition and mapping facilities both on the IdP and SP side.
19:44:22 <marlena> This lets the IdP send attributes other than "name" -- and lets the SP (via configuration) change the attributes it gets into something your app knows how to consume.
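To make the SP-side mapping concrete, here is a small Python sketch of the idea: attributes arrive under formal SAML names and get renamed to short names the application understands. In the Shibboleth SP this is done through configuration (attribute-map.xml) rather than code. The OIDs below are the standard ones for eduPersonPrincipalName, mail, givenName and sn, while the short names on the right are assumptions chosen for this example.

```python
# Formal SAML attribute name -> short name the app consumes (short names are hypothetical).
ATTRIBUTE_MAP = {
    "urn:oid:1.3.6.1.4.1.5923.1.1.1.6": "eppn",        # eduPersonPrincipalName
    "urn:oid:0.9.2342.19200300.100.1.3": "email",      # mail
    "urn:oid:2.5.4.42": "first_name",                  # givenName
    "urn:oid:2.5.4.4": "last_name",                    # sn (surname)
}


def map_attributes(asserted):
    """Rename known attributes and drop any the app has no mapping for."""
    return {
        ATTRIBUTE_MAP[name]: values
        for name, values in asserted.items()
        if name in ATTRIBUTE_MAP
    }


# Made-up attribute set; each SAML attribute can carry multiple values.
sample_assertion = {
    "urn:oid:1.3.6.1.4.1.5923.1.1.1.6": ["myself@testshib.org"],
    "urn:oid:0.9.2342.19200300.100.1.3": ["myself@example.org"],
    "urn:oid:9.9.9.9": ["not mapped, so the app never sees it"],
}
mapped = map_attributes(sample_assertion)
```

Dropping unmapped attributes keeps the app from depending on whatever an IdP happens to release.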
19:46:50 <marlena> Phil: please resume your set of approaches.  Sorry about the interrupt :-).
19:46:54 <pdurbin> gdurand: at a meeting the other week we talked a bit about local logins vs. login via an IdP... I feel like we were thinking that if a DVN installation moved to IdP logins that they would no longer be able to use local login
19:47:26 <pdurbin> sbmarks: or multiple IdPs
19:47:50 <marlena> If you let the user pick "local login" or their home institution (assuming it's in a list of IdPs you support), then you can do both.
19:48:16 <pdurbin> right but it's a design decision
19:48:58 <marlena> If you have users who are at institutions that don't have IdPs, aren't you kind of pushed into doing support for both types of login?  (Just sayin' :-).)
19:49:29 <pdurbin> well, yes, but not necessarily at the same time, if that makes sense
19:49:46 <pdurbin> I left a note to myself to pick up this thread about local logins later
19:49:50 <marlena> Don't get me wrong: some apps force Shibb/SAML authentication and don't have the notion of local login (even if they did in the past).
19:49:57 <sbmarks> amaz says: you're absolutely right. It comes down to what you'd like to support. You currently have local users... you want to support IdP logins... you may want to support linking an existing account to a shib account...
19:50:23 <marlena> Exactly.
19:50:29 <pdurbin> sbmarks: yes, exactly. we were thinking about a link
19:50:40 <pdurbin> all DVN installations now have local accounts only
19:50:41 <sbmarks> excellent
19:50:45 <gdurand> I think they would still need local login. For example, our DVN. If we allow authentication for Harvard folks via IDP, we still want all other current users to be able to log in.
19:51:04 <pdurbin> yes
19:51:27 <gdurand> It's just that if they had created the account via IDP, then they would not have a way to login locally (no stored password).
19:51:44 <marlena> When I talked to Merce & Phil & Gustavo a few weeks ago, we talked about linking the Shib-provided identity to an IQVerse-held identity -- so that they wouldn't have to muck with their authorization system.
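One way to picture the linking idea is a lookup table from an IdP-provided identity to an existing local account, so that permissions stay attached to the local account and the authorization system is untouched. The Python sketch below is an assumption-laden illustration, not DVN's schema: the class names, the (IdP entity ID, remote identifier) key, and the rule that IdP-created accounts simply have no stored password are all made up to mirror the points raised above.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Account:
    username: str
    password_hash: Optional[str] = None  # None: created via an IdP, no stored password


class AccountStore:
    def __init__(self) -> None:
        self.accounts: Dict[str, Account] = {}
        # (IdP entity ID, identifier released by that IdP) -> local username
        self.links: Dict[Tuple[str, str], str] = {}

    def create_local(self, username: str, password_hash: str) -> Account:
        if username in self.accounts:
            raise ValueError("username already taken: " + username)
        self.accounts[username] = Account(username, password_hash)
        return self.accounts[username]

    def link(self, username: str, idp_entity_id: str, remote_id: str) -> None:
        """Attach an IdP identity to an existing local account."""
        if username not in self.accounts:
            raise KeyError(username)
        self.links[(idp_entity_id, remote_id)] = username

    def login_via_idp(self, idp_entity_id: str, remote_id: str, new_username: str) -> Account:
        """Return the linked account, or create a password-less one on first login."""
        linked = self.links.get((idp_entity_id, remote_id))
        if linked is not None:
            return self.accounts[linked]
        if new_username in self.accounts:
            raise ValueError("username already taken: " + new_username)
        self.accounts[new_username] = Account(new_username)
        self.links[(idp_entity_id, remote_id)] = new_username
        return self.accounts[new_username]

    def can_login_locally(self, username: str) -> bool:
        return self.accounts[username].password_hash is not None
```

An account created through login_via_idp has no password hash, which is the "no stored password" case gdurand mentions. A previously local account that gets linked keeps both ways in.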
19:52:04 <marlena> Why would they need to login locally?
19:52:46 <pdurbin> marlena: social scientists anywhere in the world are welcome to create accounts at https://dvn.iq.harvard.edu/dvn/ and start uploading data
19:53:20 <marlena> Well, you encourage them to create their account via authN via shibb, instead of getting a new name/pwd.
19:53:35 <gdurand> In our DVN, for example, we allow anyone to create an account. So some people are Harvard affiliates and could use their Harvard credentials, but others are from anywhere else and still need the ability to create accounts, log in, etc.
19:53:44 <pdurbin> so in the future, if Harvard has an IdP we can point that DVN installation at... we would need to support login via IdP and local logins for non-Harvard people
19:54:14 <marlena> Um, what about users who have IdPs at their non-Harvard institutions?
19:54:16 <sbmarks> marlena: what if an SP (which is a consortium) wants to provide access to students across its consortium, but not all consortium members have IdPs?
19:54:47 <sbmarks> you would perhaps provide a "WAYF" page
19:54:51 <marlena> Then they allow local logins and Shib-enabled authentication.
19:55:15 <marlena> Yes, via a WAYF-type page. "Where Are You From"
19:55:44 <pdurbin> marlena: thanks. not everyone knows what WAYF means, i'm sure
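A WAYF-style login page that also keeps local login, and that can be switched off entirely for installations without any IdP, might look roughly like this in Python. Everything here is a sketch under assumptions: the IdP list, the TestShib entity ID, and the /login/local path are illustrative, and the /Shibboleth.sso/Login handler with entityID and target parameters is assumed to be how the Shibboleth SP's session initiator would be invoked.

```python
from urllib.parse import urlencode

LOCAL_LOGIN = "Local login"

# Display name -> IdP entity ID (assumed values, for illustration only).
SUPPORTED_IDPS = {
    "TestShib": "https://idp.testshib.org/idp/shibboleth",
}


def login_options(shib_enabled):
    """Choices to show on the login page; local login is always offered."""
    options = [LOCAL_LOGIN]
    if shib_enabled:
        options += sorted(SUPPORTED_IDPS)
    return options


def login_destination(choice, return_to, shib_enabled=True):
    """Where to send the browser for the chosen login method."""
    if choice == LOCAL_LOGIN:
        return "/login/local?" + urlencode({"target": return_to})
    if shib_enabled and choice in SUPPORTED_IDPS:
        params = {"entityID": SUPPORTED_IDPS[choice], "target": return_to}
        return "/Shibboleth.sso/Login?" + urlencode(params)
    raise ValueError("unsupported login choice: " + choice)
```

With shib_enabled set to False the page collapses to today's behavior (local accounts only), which is the "toggle switch" requirement raised earlier. Supporting a consortium then amounts to adding entries to the IdP table.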
19:56:06 <marlena> An alternate is to have the user get an account at "protect net" -- which serves as an IdP.
19:56:26 <sbmarks> amaz says: my bad!
19:56:30 <marlena> I had to do this so I could authenticate to an InCommon site (because Harvard's IdP (which I'm standing up) isn't yet ready).
19:57:28 <marlena> I think this is not a bad way to go -- and is worth exploring  i.e. have users get a protect net account instead of a local account.
19:57:49 <pdurbin> marlena: so you seem to be saying that if a DVN installation chooses to use an IdP, we shouldn't allow local logins at all. that people who don't have an account with the home institution IdP should still log in via *some* IdP such as http://www.protectnetwork.org
19:58:10 <marlena> No. No. No.
19:58:15 <pdurbin> ok :)
19:58:16 <marlena> I'm not saying that..
19:58:25 <marlena> I'm providing a set of possibilities.
19:59:03 <pdurbin> ok. well i think we'll need to support both local logins and login via an IdP for some DVN installations
19:59:03 <marlena> I think it's fine to provide local accounts if you want to continue to do that. And it might be fine to ask users to get a protect net account instead.
19:59:22 <pdurbin> i don't think protect net accounts are free
19:59:32 <marlena> I didn't pay.
19:59:37 <pdurbin> hmm, ok
19:59:45 <marlena> Like I said, "worth exploring" :-).
19:59:48 <pdurbin> anyway, i'd like to get back to java
19:59:52 <pdurbin> and possible directions
19:59:57 <pdurbin> bobtreacy: and your question :)
20:00:38 <pdurbin> so the first option is "fronting Glassfish with Apache"
20:00:54 <pdurbin> This write up says nothing about mod_shib, but it's probably the best resource about fronting glassfish with Apache: http://weblogs.java.net/blog/amyroh/archive/2012/02/15/running-glassfish-312-apache-http-server
20:01:56 <pdurbin> Some people call this "apache+mod_shib+ajp": http://irclog.perlgeek.de/shibboleth/2013-02-12#i_6444686
20:02:20 <pdurbin> bobtreacy: as you and i were discussing, however, this introduces a dependency on apache
20:02:36 <pdurbin> right now people don't need apache to run DVN. just glassfish
20:02:55 <pdurbin> #info option 1: fronting Glassfish with Apache
20:03:09 <pdurbin> Another option is OpenAM: http://openam.forgerock.org
20:03:19 <pdurbin> OpenAM is the continuation of a (defunct) Sun project called OpenSSO: http://en.wikipedia.org/wiki/OpenSSO
20:03:24 <bobtreacy> Marlena answered it to some extent, although I'd like to understand more what we'd lose with say OpenAM, since that shib guy you were talking to suggested using other SAML implementations, for instance on JBoss and as you say we have discussed using this on glassfish
20:03:50 <pdurbin> #info option 2: OpenAM
20:04:21 <pdurbin> bobtreacy: can you describe your experience with OpenAM?
20:04:23 <marlena> I have to pay attention to another conf call....
20:04:30 <marlena> I'll try to multiplex :-).
20:06:24 <pdurbin> (this guy votes for OpenAM by the way: https://twitter.com/jm2dev/status/301702827431059458 ... he was testing with TestShib )
20:06:53 <pdurbin> (as he wrote about here: http://lists.forgerock.org/pipermail/openam/2012-June/006831.html )
20:07:48 <pdurbin> I'm going to go on about the third and last possible direction: writing our own Service Provider (SP)
20:08:01 <pdurbin> Shibboleth uses a protocol called SAML: http://en.wikipedia.org/wiki/Security_Assertion_Markup_Language
20:08:20 <pdurbin> In this scenario, we would handle the SAML transactions ourselves
20:08:21 <bobtreacy> I've gone through setting up OpenAM on glassfish a while ago when we were thinking about SSO
20:08:36 <pdurbin> bobtreacy: ok
20:08:38 <pdurbin> This was first suggested to me in #glassfish: http://www.evanchooly.com/logs/%23glassfish/2013-02-07
20:08:40 <marlena> Phil: Don't forget the attribute transmogrifications.
20:08:50 <pdurbin> The guy mentioned some sample code at http://code.google.com/p/websso/
20:09:00 <pdurbin> Today in ##shibboleth someone says they wrote their own for node.js: http://irclog.perlgeek.de/shibboleth/2013-02-13#i_6447136
20:09:12 <pdurbin> In both cases (glassfish and node.js) the underlying library used is OpenSAML: http://opensaml.org
20:09:24 <pdurbin> #info option 3: write our own Service Provider (SP) with OpenSAML
20:10:08 <pdurbin> so those are the three options I see: 1. fronting glassfish with apache, 2. OpenAM 3. using OpenSAML and handling the SAML back and forth ourselves
20:11:22 <pdurbin> rather than trying to get any of these 3 options right into the DVN, I thought a better approach would be to add them to this simple template app: https://github.com/IQSS/iqss-javaee-template
20:11:39 <bobtreacy> Is marlena's shib going to be available to use, rather than setting up our own apache?
20:12:05 <pdurbin> which i've deployed publicly to http://dvn-vm2.hmdc.harvard.edu:8080/hello1/ so i can hopefully register it with the IdP at https://www.testshib.org
20:12:36 <pdurbin> bobtreacy: i don't know. https://www.testshib.org is out there. i *think* we can test with that
20:13:42 <pdurbin> there are also IdP VMs we can download and test with maybe: http://irclog.perlgeek.de/shibboleth/2013-02-12#i_6444019
20:14:23 <pdurbin> and http://simplesamlphp.org might be a suitable IdP we could stand up ourselves for testing: http://irclog.perlgeek.de/shibboleth/2013-02-12#i_6444004
20:15:00 <pdurbin> I'm sure we can only go so far with https://www.testshib.org since we have no control over it, but it's not a bad place to start, I think
20:15:24 <pdurbin> marlena: any word on if we can test with your IdP?
20:16:06 <marlena> Not before a month.
20:16:23 <pdurbin> marlena: ok. no problem. we'll use testshib for now. thanks
20:16:24 <marlena> We could do quick tests before then.
20:16:33 <pdurbin> ok
20:16:42 <marlena> I.e. you and I would need to coordinate to make sure the IdP is up.
20:16:53 <pdurbin> right. makes sense
20:18:22 <pdurbin> so unless there are any objections, the next step I'll take is starting to add some basic auth to https://github.com/IQSS/iqss-javaee-template and once that's up, switch it over to OpenAM
20:19:03 <pdurbin> i plan to look at  http://jsfcompref.com (JavaServer Faces 2.0: The Complete Reference) for guidance on coding up some basic auth (or sample code we have in house)
20:19:20 <bobtreacy> btw, looking at http://shibboleth.net/products/identity-provider.html it says supported container Tomcat 6 - glassfish web-tier is tomcat
20:19:45 <pdurbin> bobtreacy: are you thinking we should run our own IdP?
20:19:56 <pdurbin> the DVN team, I mean, for testing?
20:21:11 <marlena> Question: What would your own IdP give you that testshib doesn't?  (I haven't looked into testshib.)
20:21:22 <gdurand> it seems that to start we can try to use what's out there - first testshib, then Marlena's
20:21:34 <bobtreacy> ok
20:22:22 <pdurbin> #agreed use testshib.org first, then Harvard's test IdP when available
20:22:51 <marlena> If you want to set up your own IdP, feel free -- I'm just wondering what's the bang for the buck.  (It might be significant.)
20:22:53 <pdurbin> marlena: our own (or your) IdP would give us control
20:22:58 <marlena> Over what?
20:23:19 <pdurbin> i assume that when installers of the DVN go to turn on shib auth they'll need to coordinate with their home institution's IdP provider
20:24:08 <marlena> Um, not more than they'd coordinate with any IdP they want to deal with.
20:24:11 <pdurbin> hopefully we can get pretty far with a public resource like testshib.org
20:24:32 <pdurbin> #action pdurbin to add basic, non-shib auth to iqss-javaee-template and later OpenAM for testing with testshib.org
20:25:06 <pdurbin> any more action items for this topic? next up is "discussing implications of Shibboleth integration"
20:26:10 <pdurbin> (which we've kind of discussed already)
20:26:21 <pdurbin> #topic discussing implications of Shibboleth integration
20:27:07 <pdurbin> #idea make sure we can support multiple IdPs
20:27:43 <pdurbin> #info shib-enabled DVNs will probably still need local login as well
20:29:03 <pdurbin> #link http://irclog.iq.harvard.edu/dvn/2013-02-13#i_855 discussion of local login and other implications of enabling Shibboleth in a DVN
20:29:28 <marlena> Here's one:  Decide on whether you want your IdPs to be part of InCommon.  (Reason: That way you get their metadata in a standard feed.)
20:30:08 <pdurbin> marlena: makes sense
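The practical payoff of a standard metadata feed is that the list of IdPs can be read from one XML document instead of being collected by hand. As a rough Python sketch, the function below pulls the entity IDs of IdPs out of a SAML metadata aggregate. The sample document and its example.edu / example.org entities are made up, and signature verification, which a real federation feed requires before trusting the contents, is deliberately left out.

```python
import xml.etree.ElementTree as ET

MD_NS = {"md": "urn:oasis:names:tc:SAML:2.0:metadata"}

# Made-up two-entity aggregate; real feeds contain far more detail per entity.
SAMPLE_FEED = """<md:EntitiesDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata">
  <md:EntityDescriptor entityID="https://idp.example.edu/idp/shibboleth">
    <md:IDPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol"/>
  </md:EntityDescriptor>
  <md:EntityDescriptor entityID="https://sp.example.org/shibboleth">
    <md:SPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol"/>
  </md:EntityDescriptor>
</md:EntitiesDescriptor>"""


def idp_entity_ids(metadata_xml):
    """Entity IDs of every entity in the aggregate that acts as an IdP."""
    root = ET.fromstring(metadata_xml)
    return [
        entity.get("entityID")
        for entity in root.iterfind(".//md:EntityDescriptor", MD_NS)
        if entity.find("md:IDPSSODescriptor", MD_NS) is not None
    ]
```

For the sample feed this yields only the example.edu entity, since the other one is a Service Provider. A list like this could feed the choices shown on a WAYF page.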
20:31:05 <pdurbin> I think we've covered a lot of good ground today... I'm pretty much ready to wrap up
20:31:28 <pdurbin> The conversation can continue in this channel at any time as far as I'm concerned
20:31:36 <pdurbin> people are welcome to pop in with an idea
20:32:14 <pdurbin> are we done?
20:32:47 <marlena> Thanks, Phil.  Bye for now.
20:32:58 <gdurand> Thanks, Phil.
20:33:12 <pdurbin> thanks all!
20:33:16 <pdurbin> #endmeeting