Lets suppose you are a DBA at a large company. You have some great developers, and they’re learning all about how to turn on full logging of their code through the 10046 database trace. They just learned how to use this data in summary form to find out – at a very detailed level – what’s REALLY taking up all the time during their big batch program which runs too long. They’re salivating over this trace data – but you work for a big company with security policies that can’t be easily changed, where developers rarely get any kind of shell-level or filesystem-level access to a database server. You WANT them to have the ability to profile their own database code… but every time they run a trace, you get dragged into a long email exchange to locate their tracefile and transfer it to a network drive where they can access it. We’re so close to a great situation… but this last part is such a drag!!!
However: please notice the Security Addendum at the end of this article.
Of course if you’re lucky enough that your developers use certain tools from Oracle, then there are some slick 3rd party plugins that will help download and manage tracefiles for you. But what if your developers don’t want to add a whole new environment to their already memory-bound workstations? What if there are corporate policies making this difficult – such as a time-consuming review and approval process for any new software installs?
Wouldn’t it be nice of Oracle had a system view that your developers could just QUERY to find tracefiles? Maybe something like this (which of course displayed data in real time, reflecting the current status of the filesystem):
SQL> desc all_trace_files Name Null? Type ----------------------------- -------- -------------------- DIRECTORY CHAR(5) FILE_NAME VARCHAR2(400) FILE_TIME DATE FILE_BYTES NUMBER FILE_TRACEID VARCHAR2(100) FIRST_TIME DATE FIRST_ACTION VARCHAR2(100) FIRST_MODULE VARCHAR2(100) FIRST_SERVICE VARCHAR2(100) SID NUMBER SERIAL NUMBER ERROR VARCHAR2(400)
Sure, this looks nice – maybe in the next version of Oracle. When you upgrade to it, 10 years from now. But what about NOW – your old 10g database running on a huge mainframe?
We all know how nice it would be if your developers run a query like this… today… on almost any database:
SQL> select directory, file_name, file_traceid, first_action from all_trace_files 2 where first_action is not null; DIREC FILE_NAME FILE_TRACE FIRST_ACTION ----- ---------------------------------------- ---------- ---------------------------------------- UDUMP ardentp1_ora_12386502.trc Test Window - New UDUMP ardentp1_ora_22347974.trc Test Window - New UDUMP ardentp1_ora_22675498.trc SQL Window - xml_file_writing_te UDUMP ardentp1_ora_24248536.trc SQL Window - test_cases.sql UDUMP ardentp1_ora_24641726.trc DEQ UDUMP ardentp1_ora_2621852.trc Test Window - New UDUMP ardentp1_ora_3342766.trc problemTbsp UDUMP ardentp1_ora_4063428.trc SQL Window - New UDUMP ardentp1_ora_45350944.trc Debug Test Window - New UDUMP ardentp1_ora_45547538.trc Main session UDUMP ardentp1_ora_48103606.trc SQL Window - SELECT * FROM ext.e UDUMP ardentp1_ora_50593842.trc SQL Window - test_cases.sql UDUMP ardentp1_ora_50790578.trc SQL Window - New UDUMP ardentp1_ora_51380240.trc Debug Test Window - New UDUMP ardentp1_ora_52429026.trc UserBlock UDUMP ardentp1_ora_52691004.trc SQL Window - New UDUMP ardentp1_ora_60817584.trc Test Window - New UDUMP ardentp1_ora_61538330.trc Test Window - New UDUMP ardentp1_ora_62783512_schn_02.trc schn_02 Test Window - New UDUMP ardentp1_ora_62783512_schn_03.trc schn_03 Test Window - New UDUMP ardentp1_ora_6291778.trc Test Window - New UDUMP ardentp1_ora_63176754.trc Test Window - New UDUMP ardentp1_ora_66388158.trc Test Window - New UDUMP ardentp1_ora_8192480.trc Test Window - New UDUMP ardentp1_ora_8651144.trc SQL Window - test_cases.sql BDUMP ardentp1_m000_12517418.trc Monitor Tablespace Thresholds BDUMP ardentp1_m000_20119784.trc Monitor Tablespace Thresholds BDUMP ardentp1_m000_24248408.trc Monitor Tablespace Thresholds BDUMP ardentp1_m000_45154330.trc Monitor Tablespace Thresholds BDUMP ardentp1_mmnl_52560064.trc Monitor Tablespace Thresholds BDUMP ardentp1_mmon_52756580.trc Monitor Tablespace Thresholds 31 rows selected.
And of course – last but not least, you’d like your developers to actually be able to download the tracefile themselves, right? Actually, this has already been done – Dion Cho blogged some sample code about two and a half years ago. He has a very elegant solution using a pipelined PL/SQL function and UTL_FILE.
Just one minor tweak would be needed, to grab files from more than one directory. (In addition to your session traces, you might need to get logs from parallel query slaves or DBMS_JOB/SCHEDULER processes.)
Not only can you write queries to your heart’s content on this, but just about every developer environment on the planet can easily take this query output and save it to a file – then they can run the profiler of their choice on the tracefile. That would be cool, right?!
SQL> select * from table(textfile('ardentp1_ora_52691004.trc','UDUMP')) where rownum<20; COLUMN_VALUE ------------------------------------------------------------------------------------------- /u0001/app/oracle/admin/ardentp/udump/ardentp1_ora_52691004.trc Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production With the Partitioning, Real Application Clusters, OLAP, Data Mining and Real Application Testing options ORACLE_HOME = /u0001/app/oracle/product/db/10.2.0/ardentp System name: AIX Node name: ardent4 Release: 1 Version: 6 Machine: 00F66BE22C00 Instance name: ardentp1 Redo thread mounted by this instance: 1 Oracle process number: 59 Unix process pid: 52691004, image: oracle@ardent4 *** ACTION NAME:(SQL Window - New) 2011-08-15 16:35:24.791 *** MODULE NAME:(PL/SQL Developer) 2011-08-15 16:35:24.791 *** SERVICE NAME:(ardentp_dba) 2011-08-15 16:35:24.791 *** SESSION ID:(572.24734) 2011-08-15 16:35:24.791 19 rows selected.
Well I AGREE that this would be cool! And as it happens, the customer I’m working with now would find it very useful. And honestly… it’s really not that complicated to code something like this. So last night I stayed late a few hours to write it up and test it. It’s mainly build on a pipelined java function, and of course a minor tweak on Dion’s code for getting the file.
Basically, every time you query the view it scans the first 30 lines of all files in the bdump and udump directories for timestamps and identification information (module, action, service, sid, serial). This information is combined with basic file stats (name, size, last-modified-date) and the tracefile_identifier is calculated from the filename. Any errors encountered are reported in the error column (file permissions, etc).
You can download the entire thing at my github repo (specifically, the files tracefile_find.txt and tracefile_get.txt). All examples in this blog post are from a real system – nothing here is cooked, except that I changed database and host names to protect the innocent.
As pointed out in the comments after I published this post, there are some important security considerations with allowing access to trace files. A good explanation can be found in Pete Finnigan’s blog article: Is it possible to steal data with just ALTER SESSION? To summarize: if a clever hacker is able to access tracefiles then it is possible for them to see things that the database normally protects. This certainly includes table data, and it may even include passwords for other database accounts – sometimes even in cleartext. This is very important if you’re considering any tool that grants access to tracefiles – including the scripts I’ve published here. Although he generally disapproves of any tracefile access, Pete’s comment suggested a way to reduce the risk. The second table (which actually gives tracefile contents) can be limited to DBAs and then developers can get a view which only shows files that the DBAs explicitly grant. I think that the most useful part of my program is the first view anyway, which scans tracefiles and lets you use SQL to search on module/action/service.
One further quick postscript… lest you think I was terribly irresponsible with this client, we did consider security implications before endorsing this script. Admittedly I didn’t consider every scenario that Pete has written about, but I do still think we made an appropriate decision. (Also… the script wasn’t yet installed into production when I left the client, because it had to go through an approval process itself. ) In this particular case, the small group of developers getting access to tracefiles were people who already had full access to all data in the database and they had passwords for all key application accounts. Although they didn’t have SYSDBA passwords or shell access, they essentially had everything else – including all the data. Furthermore, they were in a stage of performance troubleshooting where their productivity could be significantly increased with the ability to run many traces and quickly access the tracefiles (think Method R). And finally, these developers were on the same small, tight team as the DBAs who took care of backups and patching, all sitting together in one corner of the office building. The DBAs were involved in the decision and seemed to find their reduced workload a very positive thing – especially since they know all the developers who would get access.
One could certainly comment on the security policies of this client, and their decisions about balancing trust and productivity. But I do not think that we were circumventing the company security policies in this case. And certainly my scripts are not intended to support such activity. Nonetheless, I do think there are cases where this is a very useful script. It’s much more convenient for developers who – even if they can get shell access – may not know much about navigating unix and are more comfortable working in their SQL development environment.
Neat solution, I like the idea of scanning the top of each trace file to get info. Can’t wait to set it up and poke one of my trace savvy developers to give it a whirl.
Whilst the code looks nice and the view contents look nice and i like the idea of a simple life my day job is security and this looks like a very bad idea for security.
You mention at the start “companies have strict policies…. ” and it seems like this is a solution to bypass security policies that are there for a good reason.
Most companies that i work with dont allow developers in production; in some cases I find violations of this during audits; some minor some absolutely major. Having developers in prod per-se is probably not that bad on the scale of things; they are valid employees, they are doing a valid job… the real issue for me is two-fold.
1) They have the knowledge often to get what they want, they know how things work, they know where the good stuff is, the dangter is what they can see if they are allowed in there
2) The second major problem is “what” they do when in production; making changes outside of change control; changing data in the goal of restoring a failed system….
These are the reasons companies have security policies.
Take a look at http://www.petefinnigan.com/weblog/archives/00001234.htm and some of the links off it to see the dangers of allowing remote access to trace files.
All of that said developers sometimes do need trace taken against prod as sometimes thats the only way to get the right traces. if you really need to expose trace then do it in a controlled way. maybe create the views and code as SYS and they allow access to approved trace simply by creating a user view based on the core view that only allows access tpo approved files. You could do this based on a lookup table/flag system. The DBA can check the trace through the view, make sure its genuine and add a flag to allow the end user (developer) to see it. Bear in mind though that this is a compromise and not ideal; my view is that developers should not have access to production. A solution based on limited access (time, who, location, why ) would be better
Pete – thanks for the insightful comment. Seemed to me that it merited more than just a brief response so I have updated the actual blog post with a “security addendum”.