How can I read XMP data from a JPG with PHP?
XMP data is literally embedded in the image file, so you can extract it from the file itself with PHP's string functions.
The following demonstrates this procedure (I'm using SimpleXML, but any other XML API, or even simple and clever string parsing, may give you equal results):
$content = file_get_contents($image);
$xmp_data_start = strpos($content, '<x:xmpmeta');
$xmp_data_end = strpos($content, '</x:xmpmeta>');
$xmp_length = $xmp_data_end - $xmp_data_start;
$xmp_data = substr($content, $xmp_data_start, $xmp_length + 12);
$xmp = simplexml_load_string($xmp_data);
Just two remarks:
- XMP makes heavy use of XML namespaces, so you'll have to keep an eye on that when parsing the XMP data with some XML tools.
- Considering the possible size of image files, you may not be able to use file_get_contents(), as this function loads the whole image into memory. Using fopen() to open a file stream resource and checking chunks of data for the key sequences <x:xmpmeta and </x:xmpmeta> will significantly reduce the memory footprint.
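To illustrate the first remark, here is a minimal sketch of namespace-aware access with SimpleXML. The sample packet and the choice of the dc:title field are assumptions made for illustration; a real packet is usually much larger and may not contain that field at all.

```php
<?php
// Hypothetical sample packet, standing in for the $xmp_data extracted above.
$xmp_data = <<<XML
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">Sunset</rdf:li>
        </rdf:Alt>
      </dc:title>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
XML;

$xmp = simplexml_load_string($xmp_data);

// XPath queries only see prefixes that have been registered explicitly.
$xmp->registerXPathNamespace('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#');
$xmp->registerXPathNamespace('dc', 'http://purl.org/dc/elements/1.1/');

// Look up the default-language title, if the packet has one.
$titles = $xmp->xpath('//dc:title/rdf:Alt/rdf:li');
$title = $titles ? (string) $titles[0] : null;
```

Without the registerXPathNamespace() calls, the prefixed query would not match anything useful, which is the usual stumbling block with XMP.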
I'm only replying to this after so much time because this seems to be the best result when searching Google for how to parse XMP data. I've seen this nearly identical snippet used in code a few times, and it's a terrible waste of memory. Here is an example of the fopen() method Stefan mentions after his example.
<?php
function getXmpData($filename, $chunkSize)
{
    if (!is_int($chunkSize)) {
        throw new RuntimeException('Expected integer value for argument #2 (chunkSize)');
    }
    if ($chunkSize < 12) {
        throw new RuntimeException('Chunk size cannot be less than 12 argument #2 (chunkSize)');
    }
    if (($file_pointer = fopen($filename, 'rb')) === FALSE) {
        throw new RuntimeException('Could not open file for reading');
    }
    $startTag = '<x:xmpmeta';
    $endTag = '</x:xmpmeta>';
    $buffer = '';
    $hasXmp = FALSE;
    while (($chunk = fread($file_pointer, $chunkSize)) !== FALSE) {
        if ($chunk === "") {
            break;
        }
        $buffer .= $chunk;
        $startPosition = strpos($buffer, $startTag);
        $endPosition = strpos($buffer, $endTag);
        if ($startPosition !== FALSE && $endPosition !== FALSE) {
            // Both tags are buffered: cut out the complete packet and stop reading.
            $buffer = substr($buffer, $startPosition, $endPosition - $startPosition + strlen($endTag));
            $hasXmp = TRUE;
            break;
        } elseif ($startPosition !== FALSE) {
            // Start tag found: discard everything before it and keep accumulating.
            $buffer = substr($buffer, $startPosition);
            $hasXmp = TRUE;
        } elseif (strlen($buffer) >= strlen($startTag)) {
            // No start tag yet: keep only the tail that could hold a start tag split across chunks.
            $buffer = substr($buffer, -(strlen($startTag) - 1));
        }
    }
    fclose($file_pointer);
    return ($hasXmp) ? $buffer : NULL;
}
A simple way on Linux is to call the exiv2 program, available in an eponymous package on Debian.
$ exiv2 -e X extract image.jpg
will produce image.xmp containing the embedded XMP, which is now yours to parse.
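If you want to drive this from PHP, the command can be wrapped roughly as follows. This is only a sketch: the function names are hypothetical, and it assumes exiv2 is on the PATH, that exec() is allowed, that the sidecar file is written next to the image as described above, and that no sidecar with that name exists yet.

```php
<?php
// Hypothetical helper: image.jpg -> image.xmp in the same directory.
function xmpSidecarPath($image_path)
{
    $info = pathinfo($image_path);
    return $info['dirname'] . '/' . $info['filename'] . '.xmp';
}

// Hypothetical wrapper around the exiv2 call shown above.
function extractXmpWithExiv2($image_path)
{
    // escapeshellarg() keeps spaces and quotes in the path from breaking the command.
    $command = 'exiv2 -e X extract ' . escapeshellarg($image_path);
    exec($command, $output, $status);

    $sidecar = xmpSidecarPath($image_path);
    if ($status !== 0 || !is_file($sidecar)) {
        return null; // exiv2 failed or the image carries no XMP
    }
    return file_get_contents($sidecar);
}
```

The returned string can then be handed to simplexml_load_string() like any other XMP packet.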
I know... this is kind of an old thread, but it was helpful to me when I was looking for a way to do this, so I figured this might be helpful to someone else.
I took this basic solution and modified it so it handles the case where the tag is split between chunks. This allows the chunk size to be as large or small as you want.
<?php
function getXmpData($filename, $chunk_size = 1024)
{
    if (!is_int($chunk_size)) {
        throw new RuntimeException('Expected integer value for argument #2 (chunk_size)');
    }
    if ($chunk_size < 12) {
        throw new RuntimeException('Chunk size cannot be less than 12 argument #2 (chunk_size)');
    }
    if (($file_pointer = fopen($filename, 'rb')) === FALSE) {
        throw new RuntimeException('Could not open file for reading');
    }
    $tag = '<x:xmpmeta';
    $buffer = false;
    // find open tag
    while ($buffer === false && ($chunk = fread($file_pointer, $chunk_size)) !== false) {
        if (strlen($chunk) <= 10) {
            break;
        }
        if (($position = strpos($chunk, $tag)) === false) {
            // if open tag not found, back up just in case the open tag is on the split.
            fseek($file_pointer, -10, SEEK_CUR);
        } else {
            $buffer = substr($chunk, $position);
        }
    }
    if ($buffer === false) {
        fclose($file_pointer);
        return false;
    }
    $tag = '</x:xmpmeta>';
    $offset = 0;
    while (($position = strpos($buffer, $tag, $offset)) === false && ($chunk = fread($file_pointer, $chunk_size)) !== FALSE && $chunk !== '') {
        // resume the search just before the new chunk in case the close tag is split between chunks.
        $offset = max(0, strlen($buffer) - 12);
        $buffer .= $chunk;
    }
    fclose($file_pointer);
    if ($position === false) {
        // this would mean the open tag was found, but the close tag was not. Maybe file corruption?
        throw new RuntimeException('No close tag found. Possibly corrupted file.');
    }
    return substr($buffer, 0, $position + 12);
}
?>
Bryan's solution was the best one so far, but it had a few issues, so I modified it to simplify it and remove some functionality.
There were three issues I found with his solution:
A) If one of the strings we're searching for is split across a chunk boundary, it won't be found. Small chunk sizes are more likely to cause this issue.
B) If the chunk contains both the start AND the end, it won't find it. This is an easy one to fix with an extra if statement that rechecks the chunk in which the start is found to see whether the end is also there.
C) The else statement added to the end to break the while loop if it doesn't find the XMP data has the side effect that if the start element isn't found on the first pass, it will not check any more chunks. This is likely easy to fix too, but given the first issue it's not worth it.
My solution below isn't as powerful, but it's more robust. It checks only one chunk and extracts the data from that. It only works if the start and end are both in that chunk, so the chunk size needs to be large enough to ensure that it always captures that data. In my experience with files exported from Adobe Photoshop/Lightroom, the XMP data typically starts at around 20 kB and ends at around 45 kB. My chunk size of 50k seems to work nicely for my images; it could be much smaller if you strip some of that data on export, such as the CRS block, which holds a lot of develop settings.
function getXmpData($filename)
{
    $chunk_size = 50000;
    $buffer = NULL;
    if (($file_pointer = fopen($filename, 'rb')) === FALSE) {
        throw new RuntimeException('Could not open file for reading');
    }
    $chunk = fread($file_pointer, $chunk_size);
    fclose($file_pointer);
    if ($chunk !== FALSE && ($posStart = strpos($chunk, '<x:xmpmeta')) !== FALSE) {
        $buffer = substr($chunk, $posStart);
        $posEnd = strpos($buffer, '</x:xmpmeta>');
        // Return NULL rather than a truncated packet when the end tag lies beyond the chunk.
        $buffer = ($posEnd !== FALSE) ? substr($buffer, 0, $posEnd + 12) : NULL;
    }
    return $buffer;
}
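Since the right chunk size depends on where the XMP packet sits in your own files, it can help to measure that on a few samples first. The helper below is a hypothetical sketch for that purpose only; it reads the whole file into memory, so it is meant for one-off checks rather than production use.

```php
<?php
// Hypothetical helper: reports the byte offsets at which the XMP packet starts and ends.
function findXmpOffsets($filename)
{
    $content = file_get_contents($filename);
    if ($content === false) {
        throw new RuntimeException('Could not read file');
    }
    $start = strpos($content, '<x:xmpmeta');
    if ($start === false) {
        return null; // no XMP packet in this file
    }
    $end = strpos($content, '</x:xmpmeta>', $start);
    if ($end === false) {
        return null; // start tag without a matching end tag
    }
    // 'end' is the offset just past the closing tag, i.e. the minimum usable chunk size.
    return array('start' => $start, 'end' => $end + strlen('</x:xmpmeta>'));
}
```

Running this over a representative set of exports and taking the largest 'end' value, plus some margin, gives a chunk size grounded in your own data.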
Thank you Sebastien B. for that shortened version :). If you want to avoid the problem where chunk_size is just too small for some files, just add recursion.
function getXmpData($filename, $chunk_size = 50000)
{
    if (($file_pointer = fopen($filename, 'rb')) === FALSE) {
        throw new RuntimeException('Could not open file for reading');
    }
    $chunk = fread($file_pointer, $chunk_size);
    fclose($file_pointer);
    if ($chunk === FALSE) {
        return NULL;
    }
    $posStart = strpos($chunk, '<x:xmpmeta');
    if ($posStart !== FALSE) {
        $posEnd = strpos($chunk, '</x:xmpmeta>', $posStart);
        if ($posEnd !== FALSE) {
            return substr($chunk, $posStart, $posEnd - $posStart + 12);
        }
    }
    // the whole file has been read and holds no complete XMP packet: stop here
    if (strlen($chunk) < $chunk_size) {
        return NULL;
    }
    // recursion here
    return getXmpData($filename, $chunk_size * 2);
}
I've developed the Xmp Php Toolkit extension: it's a PHP 5 extension based on the Adobe XMP Toolkit, which provides the main classes and methods to read, write, and parse XMP metadata from JPEG, PSD, PDF, video, audio... This extension is under the GPL licence. A new release will be available soon for PHP 5.3 (it is currently only compatible with PHP 5.2.x) and should be available on Windows and Mac OS X (currently only FreeBSD and Linux). http://xmpphptoolkit.sourceforge.net/
If you have ExifTool available (a very useful tool) and can run external commands, you can use its option to extract XMP data (-xmp:all) and output it in JSON format (-json), which you can then easily convert to a PHP object:
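// escapeshellarg() keeps spaces, quotes, and shell metacharacters in the path from breaking the command
$command = 'exiftool -g -json -struct -xmp:all ' . escapeshellarg($image_path);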
exec($command, $output, $return_var);
$metadata = implode('', $output);
$metadata = json_decode($metadata);
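As a rough sketch of what to do with the decoded result: the JSON is an array with one entry per file. The sample output below is made up for illustration, and the assumption that -g places the XMP tags under an "XMP" key should be checked against the output for your own files.

```php
<?php
// Hypothetical exec() output for a single image; real tag names depend on the file.
$output = array('[{"SourceFile":"photo.jpg","XMP":{"Title":"Sunset","Creator":["Jane Doe"]}}]');

// Decode to associative arrays (second argument true) for simpler key checks.
$metadata = json_decode(implode('', $output), true);

// One entry per file; assumed here: XMP tags grouped under the "XMP" key.
$xmp = array();
if (is_array($metadata) && isset($metadata[0]['XMP'])) {
    $xmp = $metadata[0]['XMP'];
}

// Tags that are absent from the file simply do not appear in the array.
$title = isset($xmp['Title']) ? $xmp['Title'] : null;
```

Checking each key with isset() matters because ExifTool only emits the tags a given file actually contains.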
There is now also a GitHub repo you can add via Composer that can read XMP data:
https://github.com/jeroendesloovere/xmp-metadata-extractor
composer require jeroendesloovere/xmp-metadata-extractor