All
you ever wanted to know about
exp and imp
(but were afraid to ask).
Copyright
Oriole Corporation, 1999
Moving
data out of Oracle and back in
exp
and imp are probably the most important utilities for Oracle DBAs but,
unfortunately, they are far from being the last word in efficiency. Some of
the tools on sale on the Oriole website
were developed especially to overcome their shortcomings and provide much more
efficient solutions.
Nevertheless, exp and imp have a number of uses and certainly
are quite sufficient for small volume databases (the numbers of which are dwindling)
and you need to have a thorough understanding of what you can do with them -
and you can do a lot - before looking for more innovative solutions.
Our purpose here is not to replace Oracle's Database Utilities User's Guide
but to explain how it works rather than how to make it work, and to point out
to interesting features a beginner might overlook.
Exporting
data
What
exp does
exp
is a program which takes both data and structures out of an Oracle database
and writes them to a file. The resulting file is binary (think about it when
using ftp, even if you have not compressed it), its format is a proprietary
Oracle format and it can only be read by the imp utility. You cannot
use it to transfer data to something other than Oracle - if you want to transfer
a small amount of data to reimport into Excel, for instance, the best you can
do is to use SQL*Plus and its spool command to save the result of a
query to a text file. For huge volumes of data you should either write your
own extractor or use a commercially available one such as Oriole's
pdqout
or competitive products (how dare they ?).
How
it works
exp
allows you to dump either the full database, or objects which belong to a list
of users, or a list of tables. What exp precisely extracts from the database
can be controlled by options (exp help=y gives you the full list);
what we are referring to hereafter is the default behavior.
When exp exports a table, it writes to the file
- an
instruction to become the owner of the table
- the
full text of the CREATE TABLE statement, complete with tablespace
name and STORAGE clause. Note that by default exp will try
to compress the table, which is not compression but the CREATE
statement will cause the creation of the table in a single chunk (extent
in Oracle-speak) if it was fragmented. It has nothing to do with how much
storage is really used. If a temporary work table grew and became three 1Mb
pieces before having all its rows deleted, exp will generate a statement
requesting creation of a 3Mb, single-piece table - even if it's now empty.
- the
data. Do not expect to see it as returned by a SELECT : everything
is written in the Oracle internal format (numbers and dates are illegible);
it is organized differently but it is very close to what the datafile contains.
- statements
to create indexes and non-referential constraints (such as primary keys).
- referential
constraints.
- GRANTs
on the table to other users.
If
you export an owner, exp will write to the file not only what is given
above for each table (in alphabetical order), but also stored procedures and
packages created by the user, as well as views, sequences, synonyms ...
If you export the full database, it will write statements to recreate all tablespaces
but SYSTEM, to recreate rollback segments, roles, users, and everything
owned by each user.
Useful
things to remember
- When
using exp, always keep in mind that imp can always reload a
subset of what was exported, but not a superset ! So, if in doubt, export
rather more than what you think is really needed.
- If
you are using exp with Oracle 7.3 or above, always use option DIRECT=Y
which is really much faster than the default 'conventional path'.
- With
option ROWS=N exp allows you to export all the DDL statements
to a file - a binary one, not an editable one unfortunately - without the
data. For a negligible increase in processing time, it can be a good idea
when exporting data to do two successive, identical exports, one with the
data, the other without.
The reason is that to successfully reimport everything (see below) you must
sometimes reapply at least some of the DDL statements after having reloaded
the data. You can ask imp only to apply the DDL statements without
reloading the data in the file; however, it then has to read the whole file,
which may be very big, and to skip the data. A 'DDL only' pass will run much
faster using a file which contains DDL only.
- Beware
of compatibility between different Oracle releases (see below). A file exported
with exp from an earlier release can be imported in a later release
but the reverse is not true since new features and syntax may have been incorporated
in the later release.
- There
is an 'incremental export' feature. Forget about it, it is not worth the trouble.
In incremental mode, only tables the contents of which have changed since
the last time are exported - fully, even if a table contains 10 million rows
of which only one was updated. As usually the most updated tables are also
the biggest ones, do not expect an incremental export to run significantly
faster than a basic export; however, restoration is likely to be significantly
more difficult.
- Beware
of the default COMPRESS=Y option. If some tables from pre-partitioning
days were carefully and artistically spread across a number of files on different
disk drives, you certainly do not want this - especially as the monolithic
version may not fit in. The case of work tables is a more subtle one. Assume
a work table has grown to reach a respectable number of extents. If it is
recreated in a single chunk, you will no longer be able to deallocate unused
storage by using a simple TRUNCATE TABLE or ALTER TABLE - you will have to
drop and recreate it, which may not be easy for a number of reasons. Therefore,
do not forget to deallocate unused storage before doing the export.
Troubleshooting
Compared
with imp, exp usually runs pretty smoothly. What you can encounter
when executing it, is that old favorite, ORA-000942 : Table or view does
not exist. The reason is that exp uses a set of tables (chiefly
to deal with incremental exports), normally created when installing the views
on the data dictionary (the catalog) but which are sometimes forgotten for one
reason or another.
Under Unix, the script which creates them can be found under $ORACLE_HOME/rdbms/admin
and is named catexp.sql. It must be run by SYS; usually, you
connect as INTERNAL under svrmgrl to run it.
It is worthwhile to say that it is possible to some extent to export from an
Oracle8 database in order to reload into an Oracle7 database.
To this effect, you must be in an Oracle7 environment and use exp shipped
with Oracle7, connecting through SQL*Net to the Oracle8 database (which may
well be running on the same machine) by using something such as :
exp
system/manager@V8DB owner=SCOTT file=mydump.dmp
However, the tables used by exp have changed between Oracle7 and Oracle8;
to be able to do this, you must first have run catexp7.sql on
your Oracle8 database in the way indicated above for catexp.sql.catexp7.sql
is not run by default when installing the database !
Besides missing tables, the chief problem you can run into is lack of disk space
to write the export file ! Always check beforehand. If you want to estimate
how much is required prior to performing the export, you should sum the number
of bytes in DBA_SEGMENTS for segments the SEGMENT_TYPE of which is TABLE and
which fall in what you want to export (if you are doing a full database export,
don't count tables owned by SYS).
select round(sum(bytes) / 1024 / 1024, -1) "Approx. MB req."
from DBA_SEGMENTS
where segment_type = 'TABLE'
and owner != 'SYS'
and ... suitable condition to get tables to export;
It
will give you a rough but reasonably reliable estimate (all the more as reality
should be less, since one might expect the volume of DDL to be less than the
segment overhead, unless you have a lot of stored procedures).
Importing
data
What imp
does
imp
takes as input a file generated by exp, and attempts to reload into the
database either everything or selected bits.
How it
works
imp
tries to execute DDL statements as it encounters them and, in the case of tables,
to load the data after having created the table. The default behavior is to
skip the data loading if one failed to create the table - either for lack of
space or simply because it already exists.
Useful
things to remember
- As said
above, imp cannot import more than exp has exported ! Beware
that the option FULL=Y, which is available with both exp and
imp, has totally different meanings between one tool and the other
: with exp it means 'export the full database', with imp it
means 'import the full file' - which may well contain the contents of a single
table, or no data at all.
- imp
allows you to change table (and index and stored procedure and ...) ownership
in the process with the FROMUSER=... and TOUSER=... options.
- When
trying to create tables or indexes, imp has a quite peculiar behavior.
It tries to create the object in the tablespace indicated in the DDL statement.
However, if the tablespace doesn't exist or if the object owner has no quota
(is not allowed to create anything there) it will have a second try and will
use the object owner's default tablespace instead. This is sometimes used
to migrate objects from one tablespace to another tablespace - export, change
the user's default setting, and reimport. It can also lead to having a lot
of things created in the wrong place.
- imp
has to create indexes, which means sorting. If you have a very big import
to do, don't forget to massively increase the sort_area_size init.ora
parameter, shutdown and restart the database - and to put the original value
back after the import. It will significantly improve performance by decreasing
the number of I/Os.
- if your
database is in ARCHIVELOG mode, it may be interesting to switch back
to NOARCHIVELOG just for the duration of the import. You are likely
to generate quite a bit of redo log.
- (not
for beginners) and what about playing with NOLOGGING with Oracle8
?
Troubleshooting
Quite obviously
imp uses the same dictionary tables as exp, and therefore what
has been said on the subject above still applies.
It is extremely rare that an import doesn't generate any error messages. Most
of the time, these errors are unimportant. However, a final pass to tidy everything
up is almost always a necessity.
Amongst other things, if importing into existing structures, beware that imp
fires insert triggers and checks constraints, so disable everything first (scripts
to disable and re-enable triggers and constraints are available from this site).
There is no such problem when creating objects on the fly since data is inserted
before triggers are created and foreign keys are defined after everything has
been loaded.
What else can go wrong ? Lots of things ...
- Rollback
segment problems
By default, imp commits only when tables are totally loaded. Needless
to say, most of the times your rollback segments cannot cope. This is why
you should always use COMMIT=Y which forces a commit after each buffer
has been loaded ... BUT do not forget to massively increase the buffer
size ! (BUFFER parameter, see below)
- Segment
creation problems
This is the biggest problem you can encounter, with usually the error message
ORA-01658: unable to create INITIAL extent for segment in
tablespace ...
It is rare to have this when doing a full export/import, however
it may quite well happen when importing some tables only, if the tablespace
is highly fragmented (something can fit in in several chunks and not in a
single continuous one), or when reimporting data into an improperly sized
database (refreshing the data in a test database after months of intense activity
in the production database, for instance). More rarely you can also have
ORA-01950: no privileges on tablespace '...'
if the user has no quota on his/her default tablespace. This may sound
strange, but this is an excellent way to ensure that developers do not forget
the STORAGE clause when writing their DDL scripts (see the paper
on user management in this section), or, in the specific case of import, to
check one has not missed the creation of a tablespace in one of a series of
databases (development, test, production ...) supposed to have more or less
the same structures.
Note that such an error will of course cause a lot of errors, in the case
of tables errors ORA-00942 when trying to create indexes, grants, views, etc.
and in the case of an index associated with a primary key constraint errors
ORA-02270: no matching unique or primary key for this column-list
when trying to create related foreign keys. How can we get out of this
? Well, by recreating manually what has failed. Depending on the case, you
may wish to add a new file to the database or to modify the STORAGE
clause.
Now the problem is, do we have the SQL script to recreate the object ? If
not, you can use the imp error log, which has displayed the SQL (although
enclosing all lines between double quotes and arbitrarily cutting off the
statement at regular intervals). Unless your favorite past-time is reformating
SQL statements, a smarter way to proceed is to use the INDEXFILE=filename
option of imp. This generates a SQL file to recreate the indexes associated
with what is specified in the import (full file, some schemas, some tables
...). It is also useful to regenerate the CREATE TABLE since those
are also displayed although as comments. Unfortunately, this feature stubbornly
ignores indexes implicitly created by enabled primary keys or unique constraints.
This is why being able to regenerate DDL statements is a feature you often
find in 3rd party DBA tools, and why Oriole
has developed recreator which does it in batch and in the same fashion
as exp.
- Grant
failures
Typically messages such as
ORA-01917: user or role '...' does not exist
This may
occur when you import only a schema or tables. Either you don't give a damn
about the missing user or role, in which case you can ignore those (irritating)
error messages, or you do, in which case let imp proceed to the end,
create the missing user or role, and then run once again the same imp
with options IGNORE=Y (to limit error messages) and ROWS=N
(important!), either on the file which contains data or, preferably, and since
you obviously have followed the excellent advice above, the 'structure only'
export file.
- Failure
to compile stored objects
It is quite likely you will get
ORA-04063: view "..." has errors
and similar errors on packages and procedures. The reason is that exp
exports objects by type, and within a type by schema and in alphabetical order.
If view A references view Z, the creation of A will fail because the alphabetically-challenged
Z will be created much later ... Stupid isn't it? This is why stored objects
are created with the FORCE keyword which enables us to create an
object which has failed to compile - it is an acknowledgement that the invalidity
is a transient status; however when Z has been created correctly this will
not recompile A. Another, rarer case is linked to functions. Procedures, functions
and packages are, pretty logically, created after all views since they may
reference them. The problem is that a view can also refer to a user-defined
(and uncreated at the time of creation of the view) function.
Invalid objects are not terribly dangerous things, since Oracle is supposed
to try to recompile invalid objects dynamically when they are referenced.
This is at least the theory, and there have been releases when this theory
was proven false. Anyway, before releasing a freshly imported database to
users, it looks a little sloppy to have invalid objects in it; for one thing,
one likes to know whether it will recompile properly at runtime or whether
it will crash piteously. FREE scripts are available elsewhere on this website
to recompile all invalid objects, and they usually prove handy after a big
import.
Beware of one peculiar problem ; stored objects which reference private
database links always fail to compile during import, and, curiously, if SCOTT's
procedure P which refers to a private database link is recompiled by SYSTEM
using
ALTER PROCEDURE SCOTT.P COMPILE;
it will fail all the same, although it will succeed if done by SCOTT.
Private database links are about the only objects which cannot be prefixed
by their owner's name, and it may well have something to do with it. How to
fix the problem, then ? Well, this is another case when the script become.sql
available FREE from Oriole proves invaluable,
since it allows a DBA to connect as anybody without knowing nor changing the
password. All SYSTEM will have to do is
SQL> @become SCOTT
You are now connected as user SCOTT
SQL> ALTER PROCEDURE P COMPILE;
Procedure altered.
SQL>
The main
problem with imp is of course speed. The speed of exp has always
been satisfactory, even before DIRECT=Y - moreover, you can always
plan when you download the data, not always when you will have users on your
back asking you to reload it ASAP ! This is why once you have tasted the pdqout/sqlldr
(with the direct path) combination, it quickly becomes addictive ...
A
number of problems exp and imp can help you solve
- The classical
use of exp/imp is for reorganizing the data - defragmenting
the database. However, as pointed out above, one must be careful, especially
if you are defragmenting only parts of your database, and if some of the fragmentation
is deliberate, to spread a big table over a number of files.
- The other
classical use is creating or updating the data in a database from the data
in another one - such has updating the test database in order to be able to
reproduce a bug which occurs in production (it sometimes happens...). Once
again, you have to be careful, especially if what you are reimporting is a
schema. To be certain that the import goes well, the best thing to do is to
drop the objects before reimporting them. One way to do this is to drop the
user altogether with all his objects - however, if you have exported the schema,
you do not have in your export file the statement to recreate the Oracle account
and grant it the privileges it has, so if you don't have the scripts ready
you are likely to waste a lot of time in the operation. Don't drop only tables
either - if sequences were used to create data in them and if you have not
recreated the correct values for the sequences on the target database, you
can be certain you'll have primary key violations when trying to insert in
some tables. The best way to proceed is to drop ALL the objects in the schema
(while keeping the Oracle account) and then reimport into it. A script to
drop all of a user's objects is available FREE from Oriole.
- Real
reorganizations, meaning for instance migrating some tables from one
tablespace to another because you have added more disks to your system,
or partitioning big tables after an Oracle7 to Oracle8 migration are
another story. Basically, imp attempts to recreate everything
as is, defragmentation omitted. If you want to 'seriously' reorganize
your database, you must create all objects first from scripts before
reimporting with option IGNORE=Y to make imp load
data in existing tables. If you do not have the scripts, remember
that INDEXFILE=... can help, but do not forget that it will
not give you everything, most notably indexes associated with primary
key or unique constraints. The safest is to use a DDL generator such
as Oriole's recreator which outputs
a clean, SQL*Plus compatible .sql file.
- Remember
that by playing on users' quotas and default tablespaces you can redirect
the import into another tablespace. This, however, won't make you able to
redirect tables and indexes into different tablespaces in a single pass.
- The FROMUSER=...
TOUSER=... options allow you to change table and index ownership. You
can also peep into Oriole's X-rated section,
we won't report you.
- Once
you are converted to using sqlldr with the direct path option instead
of imp (and pdqout to replace exp) the old imp
can still have its use if you are running a pre-7.3 Oracle database and if
sqlldr fails, leaving indexes in direct load state - the INDEXFILE=...
option once again. From 7.3 on, ALTER INDEX ... REBUILD will do.
More useful
things to remember
- You'll
export and import noticeably faster if you are using (under Unix) the single-task
versions for both tools, which are named expst and impst. Those
versions are directly linked with the SGA and by-pass a number of inter-process
communication layers.
The single-task versions are not normally created during the installation.
To create them, you must do something such as :
$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk singletask
The default
buffer size for exp and imp doesn't change often over the years,
and has long been a ridiculous (by today's standards) 64K. Don't hesitate
to allocate something more in line with the memory available on your computer
- 4096 and up is certainly recommended.
(not
really for beginners) working in parallel. When working on powerful hardware,
a multi-processor with a number of controllers, it's rather frustrating not
to use it fully. If you want to do it, the only way to do it is to work on
a table by table basis, and to start several processes at once. This is a
little complicated as far as job co-ordination is concerned but it's the best
way to make the most of your hardware.
And now
what about speed ?
Here are
the actual results of tests carried out on a medium sized table (about 120,000
rows) with two indexes. We have taken as reference (time=100) the time required
by the default use of exp with the conventional path. All defaults options
have been used, except for direct=Y where specified.
It demonstrates
clearly :
- The amazing
performance of exp direct=Y
- The benefits
coming from the single-task version, which are nil when a direct path is used
- That
imp is no match for sqlldr with the direct path option. However,
the conventional use of sqlldr performs hardly faster than imp
|